LF is a dependent type theory in which many other formal systems can be conveniently embedded. 
However, correct use of LF relies on nontrivial metatheoretic developments such as proofs of 
correctness of decision procedures for LF's judgments. Although detailed informal proofs of these 
properties have been published, they have not been formally verified in a theorem prover. We 
have formalized these properties within Isabelle/HOL using the Nominal Datatype Package, closely 
following a recent article by Harper and Pfenning. In the process, we identified and resolved a 
gap in one of the proofs and a small number of minor lacunae in others. We also formally derive 
a version of the type checking algorithm from which Isabelle/HOL can generate executable code. 
Besides its intrinsic interest, our formalization provides a foundation for studying the adequacy 
of LF encodings, the correctness of Twelf-style metatheoretic reasoning, and the metatheory of 
extensions to LF. 

Categories and Subject Descriptors: F.4.1 [Mathematical Logic and Formal Languages]: Mathematical Logic - Lambda calculus and related systems 

General Terms: Languages, theorem provers 

Additional Key Words and Phrases: Logical frameworks, Nominal Isabelle 



1. INTRODUCTION 

The (Edinburgh) Logical Framework (LF) is a dependent type theory introduced 
by Harper, Honsell and Plotkin [1993] as a framework for specifying and reasoning 
about formal systems. It has found many applications, including proof-carrying 
code [Necula 1997]. The Twelf system [Pfenning and Schürmann 1999] has been 
used to mechanize reasoning about LF specifications. 

The cornerstone of LF is the idea of encoding judgments-as-types and proofs-as-terms, 
whereby judgments of a specified formal system are represented as LF-types 
and the LF-terms inhabiting these LF-types correspond to valid deductions for these 
judgments. Hence, the validity of a deduction in a specified system is equivalent to 
a type checking problem in LF. Therefore, correct use of LF to encode other logics 
depends on the proofs of correctness of type checking algorithms for LF. 

Type checking in LF is decidable, but proving decidability is nontrivial because 
types may contain expressions with computational behavior. This means that type- 
checking depends on equality-tests for LF-terms and LF-types. Several algorithms 
for such equality-tests have been proposed in the literature [Coquand 1991; Goguen 
2005b; Harper and Pfenning 2005]. Harper and Pfenning [2005] present a type- 
driven algorithm that is practical and also has been extended to a variety of richer 
languages. The correctness of this algorithm is proved by establishing soundness 
and completeness with respect to the definitional equality rules of LF. These proofs 
are involved: Harper and Pfenning's detailed pencil-and-paper proof spans more 
than 30 pages, yet still omits many cases and lemmas. 

We present a formalization of the main results of Harper and Pfenning's article. 
To our knowledge this is the first formalization of these or comparable results. While 
most of the formal proofs go through as described by Harper and Pfenning [2005], 
a few do not, and there is a gap in the proof of 
soundness. Although the problem can be avoided easily by adding to or changing 
the rules of Harper and Pfenning [2005], we found that it was still possible to prove 
the original results, though the argument was nontrivial. Our formalization was 
essential not only to find this gap in Harper and Pfenning's argument, but also to 
find and validate the possible repairs relatively quickly. 

We used Isabelle/HOL [Nipkow et al. 2002] and the Nominal Datatype Package 
[Urban et al. 2007; Urban and Tasson 2005; Urban 2008] for our formalization. 
The latter provides an infrastructure for reasoning conveniently about datatypes 
with a built-in notion of alpha-equivalence: it allows one to specify such datatypes, 
provides appropriate recursion combinators and derives strong induction principles 
that have the usual variable convention already built in. The Nominal Datatype 
Package has already been used to formalize logical relation arguments similar to 
(but much simpler than) those in Harper and Pfenning's completeness proof [Narboux 
and Urban 2007]; it is worth noting that logical relations proofs are currently 
not easy to formalize in Twelf itself, despite the recent breakthrough by Schürmann 
and Sarnat [2008]. 

Besides proving the correctness of their equivalence algorithm, Harper and Pfen- 
ning also sketched a proof of decidability. Unfortunately, since Isabelle/HOL is 
based on classical logic, proving decidability results of this kind is not straightfor- 
ward. We have formalized the essential parts of the decidability proof by providing 
inductive definitions of the complements of the relations we wish to decide. It is 
clear by inspection that these relations define recursively enumerable sets, which 
implies decidability, but we have not formalized this part of the proof. A complete 
proof of decidability would require first developing a substantial amount of computability 
theory within Isabelle/HOL, a problem of independent interest we leave 
for future work. 

We followed the arguments in Harper and Pfenning's article very closely using 
the Nominal Datatype Package for our formalisation, but the current system does 
not allow us to generate executable code directly from definitions involving nom- 
inal datatypes. We therefore also implemented a type-checking algorithm based 
on the locally nameless approach for representing binders [McKinna and Pollack 
1999; Aydemir et al. 2008]. We proved that the nominal datatype formalization of 
Harper and Pfenning's algorithm is equivalent to the locally nameless formulation. 
Moreover, by making the choice of fresh names explicit, we can generate a working 
ML implementation directly from the verified formalization. 

Outline. We first briefly review LF and its representation in the Nominal Datatype 
Package (Sec. 2). In Sec. 3, we report on our formalization. To ease comparison, 
Sec. 3 follows the structure of Harper and Pfenning [2005] closely, although this 
article is self-contained. Sections 3.1-3.5 summarize our formalization of the basic 
syntactic properties of LF and soundness and completeness of the equivalence and 
typechecking algorithms. We discuss additional lemmas, proof details, and other 
complications arising during the formalization, and discuss the gap in the soundness 
proof and its solutions in detail. The remainder of Sec. 3 reports upon formaliza- 
tions of additional results whose proofs were only sketched by Harper and Pfenning 
[2005]. These include 

(1) the admissibility of strengthening and strong extensionality rules (Sec. 3.6), 

(2) a partial formalization of decidability of algorithmic typechecking for LF, and 
a discussion of the current limitations of Isabelle/HOL in formalizing proofs 
about decidability (Sec. 3.7), 

(3) the existence and uniqueness of quasicanonical forms (Sec. 3.8), and 

(4) a partial formalization of an example proof of adequacy (Sec. 3.9), and a 
discussion of complications in the proof sketched in [Harper and Pfenning 2005]. 

In Sec. 4 we define and verify the correctness of a type checking algorithm based 
on the locally nameless representation of binders, from which Isabelle/HOL can 
generate executable code. This amounts to a verified typechecker for LF, an original 
contribution of this article. Sec. 5 summarizes the authors' experience with the 
formalization, Sec. 6 discusses related and future work and Sec. 7 concludes. 

Contributions. The metatheory of LF is well-understood: it had been studied for 
many years before the definitive presentation in Harper and Pfenning [2005]. Their 
main results were not in serious doubt, and formalizing such work might strike 
some readers as perverse or pedantic. Nevertheless, our formalization is an original 
and significant contribution to the study of logical frameworks and mechanized 
metatheory, because: 

(1) it tests the capabilities of the Nominal Datatype Package for formalizing a large 
and complex metatheoretical development, 

(2) it provides high confidence in algorithms that are widely trusted but have never 
been mechanically verified, 

(3) it elucidates a few subtle issues in the basic metatheory of LF, and 

(4) it constitutes a re-usable library of formalized results about LF, providing a 
foundation for verification of Twelf-style meta-reasoning about LF specifications, 
extensions to LF, or related type theories that are not as well-understood. 

This article is a revised and extended version of a previous conference paper 
presenting our initial formalization of the metatheory of LF [Urban et al. 2008]. 
The formal development described by this article can be obtained by request from 
the authors, and is available at http://isabelle.in.tum.de/nominal/LF/. 

2. BACKGROUND 

This article assumes some familiarity with formalization in Isabelle/HOL and its 
ML-like notation for functions and definitions. We used the Nominal Datatype 
Package in Isabelle/HOL [Urban et al. 2007; Urban and Tasson 2005; Urban 2008] 
to formalize the syntax and judgments of LF. The key features we rely upon are 

(1) support for nominal datatypes with a built-in notion of binding (i.e. $\alpha$-equivalence 
classes), 

(2) facilities for defining functions over nominal datatypes (such as substitution) 
by (nominal) primitive recursion, and 

(3) strong induction principles for datatypes and inductive definitions that build 
in Barendregt-style renaming conventions. 

Together, these features make it possible to formalize most of the definitions and 
proofs following their paper versions closely. We will not review the features of 
this system in this article, but will discuss details of the formalization only when 
they introduce complications. The interested reader is referred to previous work on 
nominal techniques and the Nominal Datatype Package for further details [Gabbay 
and Pitts 2002; Pitts 2006; Urban et al. 2007; Urban and Tasson 2005; Urban 2008]. 

2.1 Syntax of LF 

The logical framework LF [Harper et al. 1993] is a dependently-typed lambda-calculus. 
We present it here following closely the article by Harper and Pfenning [2005], 
to which we refer from now on as HP05 for brevity. The syntax of LF 
includes kinds, type families and objects defined by the grammar: 

\[
\begin{array}{lrcl}
\text{Kinds} & K, L & ::= & \mathsf{type} \mid \Pi x{:}A.\,K \\
\text{Type families} & A, B & ::= & a \mid \Pi x{:}A_1.\,A_2 \mid A\,M \\
\text{Objects} & M, N & ::= & c \mid x \mid \lambda x{:}A.\,M \mid M_1\,M_2
\end{array}
\]

where variables $x$ and constants $c$ and $a$ are drawn from countably infinite, disjoint 
sets Var and Id of variables and identifiers, respectively. Traditionally, LF has 
included $\lambda$-abstraction at the level of both types and objects. However, Geuvers 
and Barendsen [1999] established that type-level $\lambda$-abstraction is superfluous in LF. 
Accordingly, HP05 omits type-level $\lambda$-abstraction, and so do we. 

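We formalize the syntax of LF using nominal datatypes since the constructors $\lambda$ 
and $\Pi$ bind variables. Substitutions are represented as lists of variable-term pairs 
and we define capture-avoiding substitution in the standard way as 

\[
\begin{array}{rcll}
x[\sigma] & = & \mathit{lookup}\ \sigma\ x \\
c[\sigma] & = & c \\
(M\,N)[\sigma] & = & M[\sigma]\ N[\sigma] \\
(\lambda y{:}A.\,M)[\sigma] & = & \lambda y{:}A[\sigma].\,M[\sigma] & \text{provided } y \mathrel{\#} \sigma \\[1ex]
a[\sigma] & = & a \\
(A\,M)[\sigma] & = & A[\sigma]\ M[\sigma] \\
(\Pi y{:}A.\,B)[\sigma] & = & \Pi y{:}A[\sigma].\,B[\sigma] & \text{provided } y \mathrel{\#} \sigma \\[1ex]
\mathsf{type}[\sigma] & = & \mathsf{type} \\
(\Pi y{:}A.\,K)[\sigma] & = & \Pi y{:}A[\sigma].\,K[\sigma] & \text{provided } y \mathrel{\#} \sigma
\end{array}
\]

where the variable case is defined in terms of the auxiliary function $\mathit{lookup}$: 

\[
\begin{array}{rcl}
\mathit{lookup}\ [\,]\ x & = & x \\
\mathit{lookup}\ ((y, M){::}\sigma)\ x & = & (\text{if } x = y \text{ then } M \text{ else } \mathit{lookup}\ \sigma\ x)
\end{array}
\]

The side-conditions $y \mathrel{\#} \sigma$ in the above definition are freshness constraints provided 
automatically by the Nominal Datatype Package and stand for $y$ not occurring 
freely in the substitution $\sigma$. Substitution for a single variable is defined as a special 
case: $(-)[x{:=}M] \stackrel{\mathrm{def}}{=} (-)[(x, M)]$. 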
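
The substitution clauses can be mirrored in a small executable sketch. The Python below is only an illustration and not the Isabelle/HOL formalization: the tagged-tuple encoding of terms is an assumption made here, and because Python has no built-in notion of alpha-equivalence, the freshness side-condition is discharged by always renaming a bound variable to a name drawn from a hypothetical helper `fresh`, which is assumed never to clash with user-chosen names.

```python
import itertools

# Assumed encoding (not taken from the formalization):
#   objects:  ("var", x) | ("const", c) | ("app", M, N) | ("lam", y, A, M)
#   families: ("tconst", a) | ("tapp", A, M) | ("pi", y, A, B)
#   kinds:    ("type",) | ("kpi", y, A, K)
_ids = itertools.count()

def fresh():
    """Hypothetical fresh-name supply; user terms must not use names starting with '_v'."""
    return f"_v{next(_ids)}"

def lookup(sigma, x):
    """lookup [] x = x;  lookup ((y, M)::sigma) x = M if x = y, else lookup sigma x."""
    for y, m in sigma:
        if x == y:
            return m
    return ("var", x)

def subst(t, sigma):
    """Apply the simultaneous substitution sigma, a list of (variable, object) pairs, to t."""
    tag = t[0]
    if tag == "var":
        return lookup(sigma, t[1])
    if tag in ("const", "tconst", "type"):
        return t
    if tag in ("app", "tapp"):
        return (tag, subst(t[1], sigma), subst(t[2], sigma))
    # Binders (lam, pi, kpi): rename y to a fresh z, so that z is fresh for sigma,
    # then push sigma under the binder. The pair (y, z) shadows any y in sigma.
    _, y, ann, body = t
    z = fresh()
    return (tag, z, subst(ann, sigma), subst(body, [(y, ("var", z))] + sigma))

def subst1(t, x, m):
    """Single-variable substitution t[x:=m], the special case t[(x, m)]."""
    return subst(t, [(x, m)])
```

For instance, applying `subst1` to the encoding of $\lambda y{:}a.\,x\,y$ with $x := y$ produces $\lambda z{:}a.\,y\,z$ for a fresh $z$, rather than the capturing term $\lambda y{:}a.\,y\,y$. Renaming every binder is a deliberately blunt design choice; the nominal datatype infrastructure avoids it by working with alpha-equivalence classes directly.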

We use ML-like notation $[\,]$ for the empty list and $x :: L$ for list construction. LF 
includes signatures $\Sigma$ and contexts $\Gamma$, both of which we represent as lists of pairs. 
The former consist of pairs of the form $(c, A)$ or $(a, K)$ associating the constant $c$ 
with type $A$ and the constant $a$ with kind $K$ respectively, and the latter consist of 
pairs $(x, A)$ associating the variable $x$ with type $A$. Accordingly, we write $(x, A){::}\Gamma$ 
for context construction (rather than $\Gamma, x{:}A$), $\Gamma \mathbin{@} \Gamma'$ for context concatenation and 
$(x, A) \in \Gamma$ for context membership (similarly for $\Sigma$). Context inclusion is defined 
as follows: 

\[ \Gamma_1 \subseteq \Gamma_2 \;\stackrel{\mathrm{def}}{=}\; \forall x\ A.\ (x, A) \in \Gamma_1 \text{ implies } (x, A) \in \Gamma_2 \]

2.2 Validity and Definitional Equivalence 

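HP05 defines two judgments for identifying valid signatures and contexts, which we 
formalize in Fig. 1. In contrast with HP05, we make explicit that the new bindings 
do not occur previously in $\Sigma$ or $\Gamma$, using freshness constraints such as $x \mathrel{\#} \Gamma$. We 
also make the dependence of all judgments on $\Sigma$ explicit. 

Central in HP05 are the definitions of the validity and definitional equivalence 
judgments for LF, and of algorithmic judgments for checking equivalence. The 
validity and definitional equivalence rules are shown in Fig. 2 and 3. There are 
three judgments for validity and three for equivalence corresponding to objects, 
type families and kinds respectively: 

\[
\begin{array}{llll}
 & \text{Objects} & \text{Type families} & \text{Kinds} \\
\text{Validity} & \Gamma \vdash_\Sigma M : A & \Gamma \vdash_\Sigma A : K & \Gamma \vdash_\Sigma K : \mathsf{kind} \\
\text{Equivalence} & \Gamma \vdash_\Sigma M = N : A & \Gamma \vdash_\Sigma A = B : K & \Gamma \vdash_\Sigma K = L : \mathsf{kind}
\end{array}
\]

These six judgments are defined simultaneously with signature validity ($\vdash \Sigma\ \mathsf{sig}$) 
and context validity ($\vdash_\Sigma \Gamma\ \mathsf{ctx}$) by induction. We added explicit validity 
hypotheses to some of the rules; these are left implicit in HP05. We also added some 
(redundant) freshness constraints to some rules in order to be able to use strong 
induction principles [Urban et al. 2007]. 

Signature validity, $\vdash \Sigma\ \mathsf{sig}$: 

\[
\frac{}{\vdash [\,]\ \mathsf{sig}}
\qquad
\frac{\vdash \Sigma\ \mathsf{sig} \qquad [\,] \vdash_\Sigma K : \mathsf{kind} \qquad a \mathrel{\#} \Sigma}{\vdash (a, K){::}\Sigma\ \mathsf{sig}}
\qquad
\frac{\vdash \Sigma\ \mathsf{sig} \qquad [\,] \vdash_\Sigma A : \mathsf{type} \qquad c \mathrel{\#} \Sigma}{\vdash (c, A){::}\Sigma\ \mathsf{sig}}
\]

Context validity, $\vdash_\Sigma \Gamma\ \mathsf{ctx}$: 

\[
\frac{\vdash \Sigma\ \mathsf{sig}}{\vdash_\Sigma [\,]\ \mathsf{ctx}}
\qquad
\frac{\vdash_\Sigma \Gamma\ \mathsf{ctx} \qquad \Gamma \vdash_\Sigma A : \mathsf{type} \qquad x \mathrel{\#} \Gamma}{\vdash_\Sigma (x, A){::}\Gamma\ \mathsf{ctx}}
\]

Fig. 1. Validity rules for signatures and contexts 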



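Objects, $\Gamma \vdash_\Sigma M : A$: 

\[
\frac{\vdash_\Sigma \Gamma\ \mathsf{ctx} \qquad (x, A) \in \Gamma}{\Gamma \vdash_\Sigma x : A}
\qquad
\frac{\vdash_\Sigma \Gamma\ \mathsf{ctx} \qquad (c, A) \in \Sigma}{\Gamma \vdash_\Sigma c : A}
\]
\[
\frac{\Gamma \vdash_\Sigma M_1 : \Pi x{:}A_2.\,A_1 \qquad \Gamma \vdash_\Sigma M_2 : A_2 \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma M_1\,M_2 : A_1[x{:=}M_2]}
\]
\[
\frac{\Gamma \vdash_\Sigma A_1 : \mathsf{type} \qquad (x, A_1){::}\Gamma \vdash_\Sigma M_2 : A_2 \qquad x \mathrel{\#} (\Gamma, A_1)}{\Gamma \vdash_\Sigma \lambda x{:}A_1.\,M_2 : \Pi x{:}A_1.\,A_2}
\]
\[
\frac{\Gamma \vdash_\Sigma M : A \qquad \Gamma \vdash_\Sigma A = B : \mathsf{type}}{\Gamma \vdash_\Sigma M : B}
\]

Type families, $\Gamma \vdash_\Sigma A : K$: 

\[
\frac{\vdash_\Sigma \Gamma\ \mathsf{ctx} \qquad (a, K) \in \Sigma}{\Gamma \vdash_\Sigma a : K}
\qquad
\frac{\Gamma \vdash_\Sigma A : \Pi x{:}B.\,K \qquad \Gamma \vdash_\Sigma M : B \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma A\,M : K[x{:=}M]}
\]
\[
\frac{\Gamma \vdash_\Sigma A_1 : \mathsf{type} \qquad (x, A_1){::}\Gamma \vdash_\Sigma A_2 : \mathsf{type} \qquad x \mathrel{\#} (\Gamma, A_1)}{\Gamma \vdash_\Sigma \Pi x{:}A_1.\,A_2 : \mathsf{type}}
\qquad
\frac{\Gamma \vdash_\Sigma A : K \qquad \Gamma \vdash_\Sigma K = L : \mathsf{kind}}{\Gamma \vdash_\Sigma A : L}
\]

Kinds, $\Gamma \vdash_\Sigma K : \mathsf{kind}$: 

\[
\frac{\vdash_\Sigma \Gamma\ \mathsf{ctx}}{\Gamma \vdash_\Sigma \mathsf{type} : \mathsf{kind}}
\qquad
\frac{\Gamma \vdash_\Sigma A : \mathsf{type} \qquad (x, A){::}\Gamma \vdash_\Sigma K : \mathsf{kind} \qquad x \mathrel{\#} (\Gamma, A)}{\Gamma \vdash_\Sigma \Pi x{:}A.\,K : \mathsf{kind}}
\]

Fig. 2. Validity rules for kinds, type families and objects. 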



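Objects, $\Gamma \vdash_\Sigma M = N : A$: 

\[
\frac{\vdash_\Sigma \Gamma\ \mathsf{ctx} \qquad (x, A) \in \Gamma}{\Gamma \vdash_\Sigma x = x : A}
\qquad
\frac{\vdash_\Sigma \Gamma\ \mathsf{ctx} \qquad (c, A) \in \Sigma}{\Gamma \vdash_\Sigma c = c : A}
\]
\[
\frac{\Gamma \vdash_\Sigma M_1 = N_1 : \Pi x{:}A_2.\,A_1 \qquad \Gamma \vdash_\Sigma M_2 = N_2 : A_2 \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma M_1\,M_2 = N_1\,N_2 : A_1[x{:=}M_2]}
\]
\[
\frac{\Gamma \vdash_\Sigma A_1' = A_1 : \mathsf{type} \qquad \Gamma \vdash_\Sigma A_1'' = A_1 : \mathsf{type} \qquad \Gamma \vdash_\Sigma A_1 : \mathsf{type} \qquad (x, A_1){::}\Gamma \vdash_\Sigma M_2 = N_2 : A_2 \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma \lambda x{:}A_1'.\,M_2 = \lambda x{:}A_1''.\,N_2 : \Pi x{:}A_1.\,A_2}
\]
\[
\frac{\Gamma \vdash_\Sigma M : \Pi x{:}A_1.\,A_2 \qquad \Gamma \vdash_\Sigma N : \Pi x{:}A_1.\,A_2 \qquad \Gamma \vdash_\Sigma A_1 : \mathsf{type} \qquad (x, A_1){::}\Gamma \vdash_\Sigma M\,x = N\,x : A_2 \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma M = N : \Pi x{:}A_1.\,A_2}
\]
\[
\frac{\Gamma \vdash_\Sigma A_1 : \mathsf{type} \qquad (x, A_1){::}\Gamma \vdash_\Sigma M_2 = N_2 : A_2 \qquad \Gamma \vdash_\Sigma M_1 = N_1 : A_1 \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma (\lambda x{:}A_1.\,M_2)\,M_1 = N_2[x{:=}N_1] : A_2[x{:=}M_1]}
\]
\[
\frac{\Gamma \vdash_\Sigma M = N : A}{\Gamma \vdash_\Sigma N = M : A}
\qquad
\frac{\Gamma \vdash_\Sigma M = N : A \qquad \Gamma \vdash_\Sigma N = P : A}{\Gamma \vdash_\Sigma M = P : A}
\qquad
\frac{\Gamma \vdash_\Sigma M = N : A \qquad \Gamma \vdash_\Sigma A = B : \mathsf{type}}{\Gamma \vdash_\Sigma M = N : B}
\]

Type families, $\Gamma \vdash_\Sigma A = B : K$: 

\[
\frac{\vdash_\Sigma \Gamma\ \mathsf{ctx} \qquad (a, K) \in \Sigma}{\Gamma \vdash_\Sigma a = a : K}
\qquad
\frac{\Gamma \vdash_\Sigma A = B : \Pi x{:}C.\,K \qquad \Gamma \vdash_\Sigma M = N : C \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma A\,M = B\,N : K[x{:=}M]}
\]
\[
\frac{\Gamma \vdash_\Sigma A_1 = B_1 : \mathsf{type} \qquad \Gamma \vdash_\Sigma A_1 : \mathsf{type} \qquad (x, A_1){::}\Gamma \vdash_\Sigma A_2 = B_2 : \mathsf{type} \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma \Pi x{:}A_1.\,A_2 = \Pi x{:}B_1.\,B_2 : \mathsf{type}}
\]
\[
\frac{\Gamma \vdash_\Sigma A = B : K}{\Gamma \vdash_\Sigma B = A : K}
\qquad
\frac{\Gamma \vdash_\Sigma A = B : K \qquad \Gamma \vdash_\Sigma B = C : K}{\Gamma \vdash_\Sigma A = C : K}
\qquad
\frac{\Gamma \vdash_\Sigma A = B : K \qquad \Gamma \vdash_\Sigma K = L : \mathsf{kind}}{\Gamma \vdash_\Sigma A = B : L}
\]

Kinds, $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$: 

\[
\frac{\vdash_\Sigma \Gamma\ \mathsf{ctx}}{\Gamma \vdash_\Sigma \mathsf{type} = \mathsf{type} : \mathsf{kind}}
\qquad
\frac{\Gamma \vdash_\Sigma A = B : \mathsf{type} \qquad \Gamma \vdash_\Sigma A : \mathsf{type} \qquad (x, A){::}\Gamma \vdash_\Sigma K = L : \mathsf{kind} \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma \Pi x{:}A.\,K = \Pi x{:}B.\,L : \mathsf{kind}}
\]
\[
\frac{\Gamma \vdash_\Sigma K = L : \mathsf{kind}}{\Gamma \vdash_\Sigma L = K : \mathsf{kind}}
\qquad
\frac{\Gamma \vdash_\Sigma K = L : \mathsf{kind} \qquad \Gamma \vdash_\Sigma L = L' : \mathsf{kind}}{\Gamma \vdash_\Sigma K = L' : \mathsf{kind}}
\]

Fig. 3. Definitional equivalence rules for kinds, type families and objects. 

2.3 Algorithmic Equivalence 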

The definitional equivalence judgment captures equivalence between LF terms, 
types and kinds declaratively, but it is highly nondeterministic due to the symmetry, 
transitivity and conversion rules. Accordingly, HP05 introduces algorithmic 
equivalence judgments that are type- and syntax-directed, and the main contribution 
of that article is the proof that the algorithmic and declarative systems 
coincide. 

A crucial point of the algorithm in HP05 is that it does not analyze the precise 
types of objects or kinds of types during equivalence checking; rather it only uses 
approximate simple types $\tau$ and simple kinds $\kappa$ defined as follows: 

\[ \tau ::= a^- \mid \tau \to \tau' \qquad\qquad \kappa ::= \mathsf{type}^- \mid \tau \to \kappa \]

This simplification is sufficient for obtaining a sound and complete equivalence 
checking algorithm, and also simplifies the proof development in a number of places. 

Similarly, simple contexts $\Delta, \Theta$ consist of lists of pairs $(x, \tau)$ of variables and 
simple types. We write $\vdash \Delta\ \mathsf{sctx}$ to indicate that $\Delta$ is valid, i.e. has no repeated 
variables, and write $\Delta \geq \Delta'$ to indicate that $\Delta$ contains all of the bindings of $\Delta'$ 
and $\Delta$ is a valid simple context. 

Finally, we also introduce simple signatures, also written $\Sigma$, consisting of lists of 
pairs $(c, \tau)$ or $(a, \kappa)$ of constants and simple types or kinds. We write $\vdash \Sigma\ \mathsf{ssig}$ 
to indicate that $\Sigma$ is a well-formed simple signature with no repeated type or kind 
assignments. 

The erasure function translates families and kinds to simple types and simple 
kinds: 

\[
\begin{array}{rcl}
(a)^- & = & a^- \\
(A\,M)^- & = & A^- \\
(\Pi x{:}A_1.\,A_2)^- & = & A_1^- \to A_2^-
\end{array}
\qquad\qquad
\begin{array}{rcl}
(\mathsf{type})^- & = & \mathsf{type}^- \\
(\Pi x{:}A.\,K)^- & = & A^- \to K^-
\end{array}
\]

Similarly, we write $\Gamma^-$ for the simple context resulting from replacing each binding 
$(x, A)$ in $\Gamma$ with $(x, A^-)$. Likewise, we extend the erasure function to map 
signatures $\Sigma$ to simple signatures $\Sigma^-$ in the natural way. 
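
As a rough illustration of how erasure discards dependency, the following Python sketch implements the erasure clauses on a tagged-tuple encoding. The encoding and the function names `erase` and `erase_context` are assumptions made for this example; they are not part of the formalization.

```python
# Assumed encoding (illustrative only):
#   families: ("tconst", a) | ("tapp", A, M) | ("pi", x, A, B)
#   kinds:    ("type",) | ("kpi", x, A, K)
#   simple types and kinds: ("base", a) | ("stype",) | ("arrow", t, u)

def erase(t):
    """Erase a type family or kind to its simple type or simple kind."""
    tag = t[0]
    if tag == "tconst":          # (a)^- = a^-
        return ("base", t[1])
    if tag == "tapp":            # (A M)^- = A^-, the object argument is dropped
        return erase(t[1])
    if tag in ("pi", "kpi"):     # (Pi x:A. B)^- = A^- -> B^-
        return ("arrow", erase(t[2]), erase(t[3]))
    if tag == "type":            # (type)^- = type^-
        return ("stype",)
    raise ValueError(f"not a type family or kind: {t!r}")

def erase_context(gamma):
    """Erase every binding (x, A) of a context to (x, A^-)."""
    return [(x, erase(a)) for x, a in gamma]
```

With hypothetical constants `vec` and `nat`, the dependent family $\Pi n{:}\mathit{nat}.\,\mathit{vec}\ n$ erases to $\mathit{nat}^- \to \mathit{vec}^-$: the object argument $n$ disappears, so erased types never mention object variables.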

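The rules for the algorithm also employ a weak head reduction relation $\xrightarrow{\mathrm{whr}}$ 
which performs beta-reductions only at the head of the top-level application of a 
term. It is defined as 

\[
\frac{}{(\lambda x{:}A_1.\,M_2)\,M_1 \xrightarrow{\mathrm{whr}} M_2[x{:=}M_1]}
\qquad
\frac{M_1 \xrightarrow{\mathrm{whr}} M_1'}{M_1\,M_2 \xrightarrow{\mathrm{whr}} M_1'\,M_2}
\]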
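
The two weak head reduction rules translate almost directly into a one-step function. The Python sketch below is an illustration under stated assumptions: objects are tagged tuples, type annotations on abstractions are treated as opaque (a simplification, since real annotations may mention variables), and `fresh` is a hypothetical name supply standing in for the freshness reasoning of the nominal package.

```python
import itertools

# Assumed encoding of objects (illustrative only):
#   ("var", x) | ("const", c) | ("app", M, N) | ("lam", x, ann, M)
# The annotation `ann` is treated as opaque here, a simplification.
_ids = itertools.count()

def fresh():
    """Hypothetical fresh-name supply; user terms must not use names starting with '_v'."""
    return f"_v{next(_ids)}"

def subst(t, x, s):
    """Capture-avoiding t[x:=s] on objects; every binder is renamed to a fresh name."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == x else t
    if tag == "const":
        return t
    if tag == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    _, y, ann, body = t
    z = fresh()
    return ("lam", z, ann, subst(subst(body, y, ("var", z)), x, s))

def whr_step(t):
    """Return t' with t -whr-> t', or None if no weak head reduction rule applies."""
    if t[0] != "app":
        return None
    head, arg = t[1], t[2]
    if head[0] == "lam":                      # (lam x:A1. M2) M1 -whr-> M2[x:=M1]
        return subst(head[3], head[1], arg)
    reduct = whr_step(head)                   # M1 -whr-> M1' gives M1 M2 -whr-> M1' M2
    return None if reduct is None else ("app", reduct, arg)
```

Note that `whr_step` never reduces inside an argument: an application whose head is a constant or variable is already weak head normal, whatever redexes its arguments contain.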

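The rules for the equivalence checking algorithm are given in Fig. 4. There are 
five algorithmic equivalence judgments: 

\[
\begin{array}{llll}
 & \text{Objects} & \text{Type families} & \text{Kinds} \\
\text{Algorithmic} & \Delta \vdash_\Sigma M \Leftrightarrow N : \tau & \Delta \vdash_\Sigma A \Leftrightarrow B : \kappa & \Delta \vdash_\Sigma K \Leftrightarrow L : \mathsf{kind}^- \\
\text{Structural} & \Delta \vdash_\Sigma M \leftrightarrow N : \tau & \Delta \vdash_\Sigma A \leftrightarrow B : \kappa &
\end{array}
\]

Note that the algorithmic rules are type- (or kind-) directed while the structural 
rules are syntax-directed. 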
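
Read operationally, the object-level rules of Fig. 4 describe two mutually recursive procedures: a type-directed one that expands at arrow types and weak-head-reduces at base types, and a syntax-directed one that compares weak head normal terms and synthesizes their simple type. The Python sketch below illustrates this reading for objects only. It is not the verified typechecker of Sec. 4: the tagged-tuple encoding, the use of dictionaries for $\Sigma$ and $\Delta$, the opaque annotations, and the helper `fresh` are all assumptions of this example, and the procedure may fail to terminate on ill-formed input.

```python
import itertools

# Assumed encodings (illustrative only):
#   objects:      ("var", x) | ("const", c) | ("app", M, N) | ("lam", x, ann, M)
#   simple types: ("base", a) | ("arrow", t1, t2)
_ids = itertools.count()

def fresh():
    """Hypothetical fresh-name supply; user terms must not use names starting with '_v'."""
    return f"_v{next(_ids)}"

def subst(t, x, s):
    """Capture-avoiding t[x:=s]; binders are always renamed to fresh names."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "const":
        return t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    _, y, ann, body = t
    z = fresh()
    return ("lam", z, ann, subst(subst(body, y, ("var", z)), x, s))

def whr_step(t):
    """One weak head reduction step, or None if t is weak head normal."""
    if t[0] != "app":
        return None
    if t[1][0] == "lam":
        return subst(t[1][3], t[1][1], t[2])
    r = whr_step(t[1])
    return None if r is None else ("app", r, t[2])

def alg_eq(sig, ctx, m, n, ty):
    """Type-directed check of  ctx |- m <=> n : ty."""
    if ty[0] == "arrow":                      # apply both sides to a fresh variable
        x = fresh()
        vx = ("var", x)
        return alg_eq(sig, {**ctx, x: ty[1]}, ("app", m, vx), ("app", n, vx), ty[2])
    m1 = whr_step(m)                          # base type: reduce the left side first
    if m1 is not None:
        return alg_eq(sig, ctx, m1, n, ty)
    n1 = whr_step(n)                          # then the right side
    if n1 is not None:
        return alg_eq(sig, ctx, m, n1, ty)
    return struct_eq(sig, ctx, m, n) == ty    # both weak head normal

def struct_eq(sig, ctx, m, n):
    """Syntax-directed check of  ctx |- m <-> n : ty; returns ty, or None on failure."""
    if m[0] != n[0]:
        return None
    if m[0] == "var":
        return ctx.get(m[1]) if m[1] == n[1] else None
    if m[0] == "const":
        return sig.get(m[1]) if m[1] == n[1] else None
    if m[0] == "app":
        fty = struct_eq(sig, ctx, m[1], n[1])
        if fty is None or fty[0] != "arrow":
            return None
        return fty[2] if alg_eq(sig, ctx, m[2], n[2], fty[1]) else None
    return None                               # abstractions are never compared structurally
```

For example, with a hypothetical signature assigning $f : a^- \to b^-$, the call `alg_eq` relates $f$ and its eta-expansion $\lambda x{:}a.\,f\,x$ at type $a^- \to b^-$: both sides are applied to a fresh variable, the right side weak-head-reduces to $f\,x$, and the structural rules finish the comparison. Reducing the left side before the right is one deterministic strategy among several that the rules permit.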

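Algorithmic object equivalence, $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau$: 

\[
\frac{M \xrightarrow{\mathrm{whr}} M' \qquad \Delta \vdash_\Sigma M' \Leftrightarrow N : a^-}{\Delta \vdash_\Sigma M \Leftrightarrow N : a^-}
\qquad
\frac{N \xrightarrow{\mathrm{whr}} N' \qquad \Delta \vdash_\Sigma M \Leftrightarrow N' : a^-}{\Delta \vdash_\Sigma M \Leftrightarrow N : a^-}
\]
\[
\frac{\Delta \vdash_\Sigma M \leftrightarrow N : a^-}{\Delta \vdash_\Sigma M \Leftrightarrow N : a^-}
\qquad
\frac{(x, \tau_1){::}\Delta \vdash_\Sigma M\,x \Leftrightarrow N\,x : \tau_2 \qquad x \mathrel{\#} (\Delta, M, N)}{\Delta \vdash_\Sigma M \Leftrightarrow N : \tau_1 \to \tau_2}
\]

Structural object equivalence, $\Delta \vdash_\Sigma M \leftrightarrow N : \tau$: 

\[
\frac{(x, \tau) \in \Delta \qquad \vdash \Delta\ \mathsf{sctx} \qquad \vdash \Sigma\ \mathsf{ssig}}{\Delta \vdash_\Sigma x \leftrightarrow x : \tau}
\qquad
\frac{(c, \tau) \in \Sigma \qquad \vdash \Delta\ \mathsf{sctx} \qquad \vdash \Sigma\ \mathsf{ssig}}{\Delta \vdash_\Sigma c \leftrightarrow c : \tau}
\]
\[
\frac{\Delta \vdash_\Sigma M_1 \leftrightarrow N_1 : \tau_2 \to \tau_1 \qquad \Delta \vdash_\Sigma M_2 \Leftrightarrow N_2 : \tau_2}{\Delta \vdash_\Sigma M_1\,M_2 \leftrightarrow N_1\,N_2 : \tau_1}
\]

Algorithmic type family equivalence, $\Delta \vdash_\Sigma A \Leftrightarrow B : \kappa$: 

\[
\frac{\Delta \vdash_\Sigma A \leftrightarrow B : \mathsf{type}^-}{\Delta \vdash_\Sigma A \Leftrightarrow B : \mathsf{type}^-}
\qquad
\frac{(x, \tau){::}\Delta \vdash_\Sigma A\,x \Leftrightarrow B\,x : \kappa \qquad x \mathrel{\#} (\Delta, A, B)}{\Delta \vdash_\Sigma A \Leftrightarrow B : \tau \to \kappa}
\]
\[
\frac{\Delta \vdash_\Sigma A_1 \Leftrightarrow B_1 : \mathsf{type}^- \qquad (x, A_1^-){::}\Delta \vdash_\Sigma A_2 \Leftrightarrow B_2 : \mathsf{type}^- \qquad x \mathrel{\#} (\Delta, A_1, B_1)}{\Delta \vdash_\Sigma \Pi x{:}A_1.\,A_2 \Leftrightarrow \Pi x{:}B_1.\,B_2 : \mathsf{type}^-}
\]

Structural type family equivalence, $\Delta \vdash_\Sigma A \leftrightarrow B : \kappa$: 

\[
\frac{(a, \kappa) \in \Sigma \qquad \vdash \Delta\ \mathsf{sctx} \qquad \vdash \Sigma\ \mathsf{ssig}}{\Delta \vdash_\Sigma a \leftrightarrow a : \kappa}
\qquad
\frac{\Delta \vdash_\Sigma A \leftrightarrow B : \tau \to \kappa \qquad \Delta \vdash_\Sigma M \Leftrightarrow N : \tau}{\Delta \vdash_\Sigma A\,M \leftrightarrow B\,N : \kappa}
\]

Algorithmic kind equivalence, $\Delta \vdash_\Sigma K \Leftrightarrow L : \mathsf{kind}^-$: 

\[
\frac{\vdash \Delta\ \mathsf{sctx} \qquad \vdash \Sigma\ \mathsf{ssig}}{\Delta \vdash_\Sigma \mathsf{type} \Leftrightarrow \mathsf{type} : \mathsf{kind}^-}
\qquad
\frac{\Delta \vdash_\Sigma A \Leftrightarrow B : \mathsf{type}^- \qquad (x, A^-){::}\Delta \vdash_\Sigma K \Leftrightarrow L : \mathsf{kind}^- \qquad x \mathrel{\#} (\Delta, A, B)}{\Delta \vdash_\Sigma \Pi x{:}A.\,K \Leftrightarrow \Pi x{:}B.\,L : \mathsf{kind}^-}
\]

Fig. 4. Algorithmic equivalence rules 

The main results of HP05 are soundness and completeness of the algorithmic 
judgments relative to the equivalence judgments: 

Theorem 1 (Completeness). 

(1) If $\Gamma \vdash_\Sigma M = N : A$ then $\Gamma^- \vdash_{\Sigma^-} M \Leftrightarrow N : A^-$. 
(2) If $\Gamma \vdash_\Sigma A = B : K$ then $\Gamma^- \vdash_{\Sigma^-} A \Leftrightarrow B : K^-$. 
(3) If $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$ then $\Gamma^- \vdash_{\Sigma^-} K \Leftrightarrow L : \mathsf{kind}^-$. 

Theorem 2 (Soundness). 

(1) If $\Gamma^- \vdash_{\Sigma^-} M \Leftrightarrow N : A^-$ and $\Gamma \vdash_\Sigma M : A$ and $\Gamma \vdash_\Sigma N : A$ then 
$\Gamma \vdash_\Sigma M = N : A$. 
(2) If $\Gamma^- \vdash_{\Sigma^-} A \Leftrightarrow B : K^-$ and $\Gamma \vdash_\Sigma A : K$ and $\Gamma \vdash_\Sigma B : K$ then 
$\Gamma \vdash_\Sigma A = B : K$. 
(3) If $\Gamma^- \vdash_{\Sigma^-} K \Leftrightarrow L : \mathsf{kind}^-$ and $\Gamma \vdash_\Sigma K : \mathsf{kind}$ and $\Gamma \vdash_\Sigma L : \mathsf{kind}$ then 
$\Gamma \vdash_\Sigma K = L : \mathsf{kind}$. 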

In what follows, we outline the proofs of these results and discuss how we have 
formalized them, paying particular attention to places where additional lemmas or 
different proof techniques were needed. We also discuss the gap in the soundness 
proof of HP05, along with several solutions. 

3. THE FORMALIZATION 
3.1 Syntactic properties 

The proof in HP05 starts by developing a number of useful metatheoretic properties 
for the validity and equality judgments (shown in Fig. 2 and 3), such as weakening, 
substitution, generalizations of the conversion rules and inversion principles. Most 
of these properties have multiple parts corresponding to the eight different judgments 
in the definitional theory of LF. We will list the main properties; however, to 
aid readability we will only show the statements of most of these properties for the 
object-level judgments, and we omit symmetric cases. The full formal statements 
of the syntactic properties can be found in the electronic appendix. 

To prove the main syntactic properties we needed two technical lemmas having 
to do with the implicit freshness and validity assumptions that must be handled 
explicitly in our formalization. Both are straightforward by induction, and both 
are needed frequently. 

Lemma 1 (Freshness). If $x \mathrel{\#} \Gamma$ and $\Gamma \vdash_\Sigma M : A$ then $x \mathrel{\#} M$ and $x \mathrel{\#} A$. 
Similarly, if $x \mathrel{\#} \Gamma$ and $\Gamma \vdash_\Sigma M = N : A$ then $x \mathrel{\#} M$ and $x \mathrel{\#} N$ and $x \mathrel{\#} A$. 

Lemma 2 (Implicit Validity). If $\Gamma \vdash_\Sigma M : A$ or $\Gamma \vdash_\Sigma M = N : A$ then 
$\vdash \Sigma\ \mathsf{sig}$ and $\vdash_\Sigma \Gamma\ \mathsf{ctx}$. 

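Lemma 3 (Weakening). Suppose $\vdash_\Sigma \Gamma_2\ \mathsf{ctx}$ and $\Gamma_1 \subseteq \Gamma_2$. 
(1) If $\Gamma_1 \vdash_\Sigma M : A$ then $\Gamma_2 \vdash_\Sigma M : A$. 
(2) If $\Gamma_1 \vdash_\Sigma M = N : A$ then $\Gamma_2 \vdash_\Sigma M = N : A$. 

Lemma 4 (Substitution). Suppose $\Gamma_2 \vdash_\Sigma P : C$ and let $\Gamma = \Gamma_1 \mathbin{@} [(y, C)] \mathbin{@} \Gamma_2$. 

(1) If $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ then $\vdash_\Sigma \Gamma_1[y{:=}P] \mathbin{@} \Gamma_2\ \mathsf{ctx}$. 

(2) If $\Gamma \vdash_\Sigma M : B$ then $\Gamma_1[y{:=}P] \mathbin{@} \Gamma_2 \vdash_\Sigma M[y{:=}P] : B[y{:=}P]$. 

(3) If $\Gamma \vdash_\Sigma M = N : A$ then $\Gamma_1[y{:=}P] \mathbin{@} \Gamma_2 \vdash_\Sigma M[y{:=}P] = N[y{:=}P] : A[y{:=}P]$. 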

Lemma 5 (Context Conversion). Assume that $\Gamma \vdash_\Sigma B : \mathsf{type}$ and 
$\Gamma \vdash_\Sigma A = B : \mathsf{type}$. Then: 
(1) If $(x, A){::}\Gamma \vdash_\Sigma M : C$ then $(x, B){::}\Gamma \vdash_\Sigma M : C$. 
(2) If $(x, A){::}\Gamma \vdash_\Sigma C : K$ then $(x, B){::}\Gamma \vdash_\Sigma C : K$. 

Lemma 6 (Functionality for Typing). Assume that $\Gamma \vdash_\Sigma M : C$ and 
$\Gamma \vdash_\Sigma N : C$ and $\Gamma \vdash_\Sigma M = N : C$. Then if $\Gamma' \mathbin{@} [(y, C)] \mathbin{@} \Gamma \vdash_\Sigma P : B$ then 

\[ \Gamma'[y{:=}M] \mathbin{@} \Gamma \vdash_\Sigma P[y{:=}M] = P[y{:=}N] : B[y{:=}M]. \]

Since our judgments contain explicit validity hypotheses for contexts, the proof of 
Lem. 6 relies on the fact that functionality holds also for contexts, namely 

Lemma 7 (Functionality for Contexts). If $\vdash_\Sigma \Gamma' \mathbin{@} [(x, C)] \mathbin{@} \Gamma\ \mathsf{ctx}$ 
and $\Gamma \vdash_\Sigma M : C$ then $\vdash_\Sigma \Gamma'[x{:=}M] \mathbin{@} \Gamma\ \mathsf{ctx}$. 

This fact can be established by induction on $\Gamma'$. 

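Lemma 8 (Validity). Objects, types and kinds appearing in derivable judgments 
are valid, that is 

(1) If $\Gamma \vdash_\Sigma M : A$ then $\Gamma \vdash_\Sigma A : \mathsf{type}$. 

(2) If $\Gamma \vdash_\Sigma M = N : B$ then $\Gamma \vdash_\Sigma M : B$ and $\Gamma \vdash_\Sigma N : B$ and $\Gamma \vdash_\Sigma B : \mathsf{type}$. 

Lemma 9 (Typing inversion). The validity rules are invertible, up to conversion 
of types and kinds. 

(1) If $\Gamma \vdash_\Sigma x : A$ then $\exists B.\ (x, B) \in \Gamma$ and $\Gamma \vdash_\Sigma A = B : \mathsf{type}$. 
(2) If $\Gamma \vdash_\Sigma c : A$ then $\exists B.\ (c, B) \in \Sigma$ and $\Gamma \vdash_\Sigma A = B : \mathsf{type}$. 
(3) If $\Gamma \vdash_\Sigma M_1\,M_2 : A$ then $\exists x\ A_1\ A_2.\ \Gamma \vdash_\Sigma M_1 : \Pi x{:}A_2.\,A_1$ and 
$\Gamma \vdash_\Sigma M_2 : A_2$ and $\Gamma \vdash_\Sigma A = A_1[x{:=}M_2] : \mathsf{type}$. 
(4) If $\Gamma \vdash_\Sigma \lambda x{:}A.\,M : B$ and $x \mathrel{\#} \Gamma$ then $\exists A'.\ \Gamma \vdash_\Sigma B = \Pi x{:}A.\,A' : \mathsf{type}$ and 
$\Gamma \vdash_\Sigma A : \mathsf{type}$ and $(x, A){::}\Gamma \vdash_\Sigma M : A'$. 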

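Next, HP05 establishes some inversion and invertibility properties for definitional 
equality: 

Lemma 10 (Equality inversion). 
(1) If $\Gamma \vdash_\Sigma \mathsf{type} = L : \mathsf{kind}$ then $L = \mathsf{type}$. 

(2) If $\Gamma \vdash_\Sigma A = \Pi x{:}B_1.\,B_2 : \mathsf{type}$ and $x \mathrel{\#} \Gamma$ then $\exists A_1\ A_2.\ A = \Pi x{:}A_1.\,A_2$ and 
$\Gamma \vdash_\Sigma A_1 = B_1 : \mathsf{type}$ and $(x, A_1){::}\Gamma \vdash_\Sigma A_2 = B_2 : \mathsf{type}$. 

(3) If $\Gamma \vdash_\Sigma K = \Pi x{:}B_1.\,L_2 : \mathsf{kind}$ and $x \mathrel{\#} \Gamma$ then $\exists A_1\ K_2.\ K = \Pi x{:}A_1.\,K_2$ 
and $\Gamma \vdash_\Sigma A_1 = B_1 : \mathsf{type}$ and $(x, A_1){::}\Gamma \vdash_\Sigma K_2 = L_2 : \mathsf{kind}$. 

Finally, we can prove that the product type constructor is injective up to definitional 
equality, which is needed for soundness: 

Lemma 11 (Product injectivity). Suppose $x \mathrel{\#} \Gamma$. 

(1) If $\Gamma \vdash_\Sigma \Pi x{:}A_1.\,A_2 = \Pi x{:}B_1.\,B_2 : \mathsf{type}$ then $\Gamma \vdash_\Sigma A_1 = B_1 : \mathsf{type}$ and 
$(x, A_1){::}\Gamma \vdash_\Sigma A_2 = B_2 : \mathsf{type}$. 

(2) If $\Gamma \vdash_\Sigma \Pi x{:}A.\,K = \Pi x{:}B.\,L : \mathsf{kind}$ then $\Gamma \vdash_\Sigma A = B : \mathsf{type}$ and 
$(x, A){::}\Gamma \vdash_\Sigma K = L : \mathsf{kind}$. 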

All the metatheoretic properties given above can be proved as stated in HP05 
(appealing to Lem. 1 and 2 as necessary); however, since all of the definitional 
judgments of LF are interdependent, each inductive proof must consider all 35 
cases, making each proof nontrivial as a practical matter (it is one of the biggest 
parts of our formalization). 

HP05 organizes the proofs of these metatheoretic properties very neatly. For 
example, as shown in Lem. 8, the validity judgment of terms implies the validity of 
the type. However, in order to establish this, a number of auxiliary facts that 
depend on this property have to be proved first. In order to get the proof through, 
some of HP05's rules given in Fig. 2 are formulated to explicitly include validity 
constraints such as $\Gamma \vdash_\Sigma A : \mathsf{type}$ and $\Gamma \vdash_\Sigma K : \mathsf{kind}$. After proving the above 
properties, however, we can show that these extra hypotheses are not needed, by 
establishing stronger forms of the rules: 

Lemma 12 (Strong versions of rules). The following rules are admissible: 

(1)
\[ \frac{\Gamma \vdash_\Sigma M_1 : \Pi x{:}A_2.\,A_1 \qquad \Gamma \vdash_\Sigma M_2 : A_2}{\Gamma \vdash_\Sigma M_1\,M_2 : A_1[x{:=}M_2]} \]

(2)
\[ \frac{\Gamma \vdash_\Sigma A : \Pi x{:}B.\,K \qquad \Gamma \vdash_\Sigma M : B}{\Gamma \vdash_\Sigma A\,M : K[x{:=}M]} \]

(3)
\[ \frac{(x, A_1){::}\Gamma \vdash_\Sigma M_2 = N_2 : A_2 \qquad \Gamma \vdash_\Sigma M_1 = N_1 : A_1 \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma (\lambda x{:}A_1.\,M_2)\,M_1 = N_2[x{:=}N_1] : A_2[x{:=}M_1]} \]

3.2 Algorithmic equivalence 

The main metatheoretic properties of algorithmic equivalence proved in Sec. 3 of 
HP05 are symmetry and transitivity. Several properties of weak head reduction 
and erasure needed later in HP05 are also proved. Most of the proofs were 
straightforward to formalize, given the details in HP05 (where provided). However, there 
were a few missing lemmas and other complications. The algorithmic system is 
less well-behaved than the definitional system because derivable judgments may 
have ill-formed arguments; for example, the judgment $[\,] \vdash_\Sigma (\lambda x{:}a.\,c)\,y \Leftrightarrow c : b^-$ 
is derivable, for any object term $y$, provided that $(c, b^-) \in \Sigma$, since 
$(\lambda x{:}a.\,c)\,y \xrightarrow{\mathrm{whr}} c$. Thus, analogues of Lem. 1 and 2 do not hold for the algorithmic system, and in 
rules involving binding we need to impose additional freshness constraints. Moreover, 
proof search in the algorithmic system is not necessarily terminating because 
weak head reduction may diverge if called on ill-formed terms such as 
$(\lambda x{:}a.\,x\,x)\,(\lambda x{:}a.\,x\,x)$. 

The erasure preservation lemma establishes basic properties of erasure which are 
frequently needed in HP05: 

Lemma 13 (Erasure preservation). 
(1) If $\Gamma \vdash_\Sigma A = B : K$ then $A^- = B^-$. 
(2) If $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$ then $K^- = L^-$. 
(3) If $(x, A){::}\Gamma \vdash_\Sigma B : \mathsf{type}$ then $B^- = B[x{:=}M]^-$. 
(4) If $(x, A){::}\Gamma \vdash_\Sigma K : \mathsf{kind}$ then $K^- = K[x{:=}M]^-$. 

However, we found that the hypotheses of parts 3 and 4 are unnecessary. Indeed, 
we can easily prove: 

Lemma 14 (Erasure cancels substitution). For any type family $A$, kind 
$K$, and substitution $\sigma$, we have 
(1) $A[\sigma]^- = A^-$ 
(2) $K[\sigma]^- = K^-$ 

In the proofs of symmetry and transitivity of the algorithmic judgments (Thm. 3 
and Thm. 4), we also needed the following algorithmic erasure preservation lemma 
(it is omitted from HP05, but straightforward by induction): 

Lemma 15 (Algorithmic erasure preservation). 
(1) If $\Delta \vdash_\Sigma A \Leftrightarrow B : \kappa$ then $A^- = B^-$. 
(2) If $\Delta \vdash_\Sigma A \leftrightarrow B : \kappa$ then $A^- = B^-$. 
(3) If $\Delta \vdash_\Sigma K \Leftrightarrow L : \mathsf{kind}^-$ then $K^- = L^-$. 

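The determinacy lemma establishes several important properties of weak head 
reduction and algorithmic equivalence. 

Lemma 16 (Determinacy). Suppose that $\vdash \Sigma\ \mathsf{ssig}$ and $\vdash \Delta\ \mathsf{sctx}$. 

(1) If $M \xrightarrow{\mathrm{whr}} M'$ and $M \xrightarrow{\mathrm{whr}} M''$ then $M' = M''$. 
(2) If $\Delta \vdash_\Sigma M \leftrightarrow N : \tau$ then $\nexists M'.\ M \xrightarrow{\mathrm{whr}} M'$. 
(3) If $\Delta \vdash_\Sigma M \leftrightarrow N : \tau$ then $\nexists N'.\ N \xrightarrow{\mathrm{whr}} N'$. 
(4) If $\Delta \vdash_\Sigma M \leftrightarrow N : \tau$ and $\Delta \vdash_\Sigma M \leftrightarrow N : \tau'$ then $\tau = \tau'$. 
(5) If $\Delta \vdash_\Sigma A \leftrightarrow B : \kappa$ and $\Delta \vdash_\Sigma A \leftrightarrow B : \kappa'$ then $\kappa = \kappa'$. 

However, we needed generalized forms of parts 4 and 5 in the proof of transitivity 
(Thm. 4). These properties are also later used in Thm. 13 in proving decidability 
of the algorithmic rules. 

Lemma 17 (Generalized determinacy). Suppose that $\vdash \Sigma\ \mathsf{ssig}$ and $\vdash \Delta\ \mathsf{sctx}$. 
(1) If $\Delta \vdash_\Sigma M \leftrightarrow N : \tau$ and $\Delta \vdash_\Sigma N \leftrightarrow P : \tau'$ then $\tau = \tau'$. 

(2) If $\Delta \vdash_\Sigma A \leftrightarrow B : \kappa$ and $\Delta \vdash_\Sigma B \leftrightarrow C : \kappa'$ then $\kappa = \kappa'$. 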

Verifying symmetry of the algorithmic judgments is then straightforward, using 
properties established so far. 

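Theorem 3 (Symmetry of algorithmic equivalence). 

(1) If $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau$ then $\Delta \vdash_\Sigma N \Leftrightarrow M : \tau$. 

(2) If $\Delta \vdash_\Sigma M \leftrightarrow N : \tau$ then $\Delta \vdash_\Sigma N \leftrightarrow M : \tau$. 

(3) If $\Delta \vdash_\Sigma A \Leftrightarrow B : \kappa$ then $\Delta \vdash_\Sigma B \Leftrightarrow A : \kappa$. 

(4) If $\Delta \vdash_\Sigma A \leftrightarrow B : \kappa$ then $\Delta \vdash_\Sigma B \leftrightarrow A : \kappa$. 

(5) If $\Delta \vdash_\Sigma K \Leftrightarrow L : \mathsf{kind}^-$ then $\Delta \vdash_\Sigma L \Leftrightarrow K : \mathsf{kind}^-$. 

However, verifying transitivity required more work. 

Theorem 4 (Transitivity of algorithmic equivalence). Suppose that 
$\vdash \Sigma\ \mathsf{ssig}$ and $\vdash \Delta\ \mathsf{sctx}$. 

(1) If $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau$ and $\Delta \vdash_\Sigma N \Leftrightarrow P : \tau$ then $\Delta \vdash_\Sigma M \Leftrightarrow P : \tau$. 
(2) If $\Delta \vdash_\Sigma M \leftrightarrow N : \tau$ and $\Delta \vdash_\Sigma N \leftrightarrow P : \tau$ then $\Delta \vdash_\Sigma M \leftrightarrow P : \tau$. 

(3) If $\Delta \vdash_\Sigma A \Leftrightarrow B : \kappa$ and $\Delta \vdash_\Sigma B \Leftrightarrow C : \kappa$ then $\Delta \vdash_\Sigma A \Leftrightarrow C : \kappa$. 
(4) If $\Delta \vdash_\Sigma A \leftrightarrow B : \kappa$ and $\Delta \vdash_\Sigma B \leftrightarrow C : \kappa$ then $\Delta \vdash_\Sigma A \leftrightarrow C : \kappa$. 

(5) If $\Delta \vdash_\Sigma K \Leftrightarrow L : \mathsf{kind}^-$ and $\Delta \vdash_\Sigma L \Leftrightarrow L' : \mathsf{kind}^-$ then $\Delta \vdash_\Sigma K \Leftrightarrow L' : \mathsf{kind}^-$. 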

Proof. As described in HP05, the proof is by simultaneous induction on the 
two derivations. For types and kinds, this simultaneous induction can be avoided 
by performing induction over one derivation and using inversion principles. For the 
object-level judgments (cases 1 and 2), we formalize this argument in Isabelle by 
defining object-level algorithmic judgments instrumented with a height argument, 
and prove parts 1 and 2 by well-founded induction on the sum of the heights of the 
derivations. 

Because we use induction over the height of the instrumented derivation, we 
cannot take advantage of the "strong" induction principles for algorithmic derivations 
[Urban et al. 2007]. As a result, there are several cases where we need to 
perform some explicit $\alpha$-conversion and renaming steps; these are places in an 
informal proof where one usually appeals to renaming principles "without loss of 
generality". The current version of the Nominal Datatype Package offers strong 
inversion principles that ameliorate this difficulty [Berghofer and Urban 2008]. 

The generalized determinacy property (Lem. 17) is needed here in the case of 
structural equivalence of applications. □ 

\[
\begin{array}{lcl}
\Delta \vdash_\Sigma M = N \in [\![a^-]\!] & = & \Delta \vdash_\Sigma M \Leftrightarrow N : a^- \\[0.5ex]
\Delta \vdash_\Sigma M = N \in [\![\tau \to \tau']\!] & = & \forall \Delta' \geq \Delta,\ M',\ N'.\ \ \Delta' \vdash_\Sigma M' = N' \in [\![\tau]\!] \\
 & & \quad \text{implies } \Delta' \vdash_\Sigma M\,M' = N\,N' \in [\![\tau']\!] \\[0.5ex]
\Delta \vdash_\Sigma A = B \in [\![\mathsf{type}^-]\!] & = & \Delta \vdash_\Sigma A \Leftrightarrow B : \mathsf{type}^- \\[0.5ex]
\Delta \vdash_\Sigma A = B \in [\![\tau \to \kappa]\!] & = & \forall \Delta' \geq \Delta,\ M',\ N'.\ \ \Delta' \vdash_\Sigma M' = N' \in [\![\tau]\!] \\
 & & \quad \text{implies } \Delta' \vdash_\Sigma A\,M' = B\,N' \in [\![\kappa]\!] \\[0.5ex]
\Delta \vdash_\Sigma K = L \in [\![\mathsf{kind}^-]\!] & = & \Delta \vdash_\Sigma K \Leftrightarrow L : \mathsf{kind}^- \\[0.5ex]
\Delta \vdash_\Sigma [\,] = [\,] \in [\![[\,]]\!] & = & \mathit{True} \\[0.5ex]
\Delta \vdash_\Sigma (x, M){::}\sigma = (x, N){::}\theta \in [\![(x, \tau){::}\Theta]\!] & = & \Delta \vdash_\Sigma \sigma = \theta \in [\![\Theta]\!] \text{ and } x \mathrel{\#} \Theta \\
 & & \quad \text{and } \Delta \vdash_\Sigma M = N \in [\![\tau]\!]
\end{array}
\]

Fig. 5. Logical relation definition 

Strengthening. At this point in the development, we can also prove that the algo- 
rithmic judgments satisfy strengthening; that is, unused variables can be removed 
from the context without harming derivability of a conclusion. Strengthening is 
not discussed in HP05 until later in the article, but we found it helpful in the proof 
of soundness. We first need an (easily established) freshness-preservation property 
of weak head reduction. 

Lemma 18 (Weak head reduction preserves freshness). 
If $M \xrightarrow{\mathrm{whr}} N$ and $x \mathrel{\#} M$ then $x \mathrel{\#} N$. 

With this property in hand, strengthening for algorithmic and structural equiva- 
lence can be established by induction on the structure of judgments, making use of 
basic properties of freshness, valid contexts, and the previous lemma as necessary. 

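Lemma 19 (Strengthening of algorithmic equivalence). Suppose that
$x \mathrel{\#} (\Delta', M, N)$. Then:

(1) If $\Delta' @ [(x, \tau')] @ \Delta \vdash_\Sigma M \Leftrightarrow N : \tau$ then $\Delta' @ \Delta \vdash_\Sigma M \Leftrightarrow N : \tau$.

(2) If $\Delta' @ [(x, \tau')] @ \Delta \vdash_\Sigma M \leftrightarrow N : \tau$ then $\Delta' @ \Delta \vdash_\Sigma M \leftrightarrow N : \tau$.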

Proof. Straightforward induction on derivations, using properties of freshness. 
Lem. 18 is needed in the cases involving weak head reduction to maintain the 
freshness constraints needed for the induction hypothesis. □ 
3.3 Completeness

The proof of completeness involves a Kripke-style logical relations argument. We
can define the logical relation for objects, types, and substitutions, by induction on
the structure of simple types $\tau$ and kinds $\kappa$ and simple contexts $\Theta$, respectively, as
shown in Fig. 5. This kind of logical relation is called Kripke-style because the case
for function types is modeled on Kripke's possible-worlds semantics for intuitionistic
logic: it is indexed by a variable context $\Delta$ and in the case for function types and
kinds, we quantify over all valid extensions to $\Delta$ when considering the argument
terms $M', N'$.
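
As a worked instance, unfolding the definition in Fig. 5 at a function type between two base types shows where the quantification over extended contexts enters. The base types $a^-$ and $b^-$ are chosen here purely for illustration:

```latex
\[
\Delta \vdash_\Sigma M = N \in [\![a^- \to b^-]\!]
\quad\text{iff}\quad
\forall \Delta' \geq \Delta,\, M',\, N'.\;
\Delta' \vdash_\Sigma M' \Leftrightarrow N' : a^-
\ \text{implies}\
\Delta' \vdash_\Sigma M\,M' \Leftrightarrow N\,N' : b^-
\]
```

Both occurrences of the logical relation at base type have been replaced by algorithmic equivalence, as prescribed by the base case of the definition.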

The key steps in proving completeness are showing that logically related terms 
are algorithmically equivalent (Thm. 5) and that definitionally equivalent terms are 
logically related (Thm. 6). Many properties can be established by an induction on 
the structure of types, appealing to the properties of the algorithmic judgments 
established in section 3 of HP05 and the definition of the logical relation. 

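Lemma 20 (Logical relation weakening). Suppose $\Delta' \geq \Delta$.

(1) If $\Delta \vdash_\Sigma M = N \in [\![\tau]\!]$ then $\Delta' \vdash_\Sigma M = N \in [\![\tau]\!]$.

(2) If $\Delta \vdash_\Sigma A = B \in [\![\kappa]\!]$ then $\Delta' \vdash_\Sigma A = B \in [\![\kappa]\!]$.

(3) If $\Delta \vdash_\Sigma \sigma = \theta \in [\![\Theta]\!]$ then $\Delta' \vdash_\Sigma \sigma = \theta \in [\![\Theta]\!]$.

Theorem 5 (Logically related terms are algorithmically equivalent).
Suppose $\vdash \Delta\ \mathsf{sctx}$.

(1) If $\Delta \vdash_\Sigma M = N \in [\![\tau]\!]$ then $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau$.

(2) If $\Delta \vdash_\Sigma M \leftrightarrow N : \tau$ then $\Delta \vdash_\Sigma M = N \in [\![\tau]\!]$.

(3) If $\Delta \vdash_\Sigma A = B \in [\![\kappa]\!]$ then $\Delta \vdash_\Sigma A \Leftrightarrow B : \kappa$.

(4) If $\Delta \vdash_\Sigma A \leftrightarrow B : \kappa$ then $\Delta \vdash_\Sigma A = B \in [\![\kappa]\!]$.

Lemma 21 (Closure under head expansion).

(1) If $M \xrightarrow{\mathrm{whr}} M'$ and $\Delta \vdash_\Sigma M' = N \in [\![\tau]\!]$ then $\Delta \vdash_\Sigma M = N \in [\![\tau]\!]$.

(2) If $N \xrightarrow{\mathrm{whr}} N'$ and $\Delta \vdash_\Sigma M = N' \in [\![\tau]\!]$ then $\Delta \vdash_\Sigma M = N \in [\![\tau]\!]$.

Lemma 22 (Logical relation symmetry).

(1) If $\Delta \vdash_\Sigma M = N \in [\![\tau]\!]$ then $\Delta \vdash_\Sigma N = M \in [\![\tau]\!]$.

(2) If $\Delta \vdash_\Sigma A = B \in [\![\kappa]\!]$ then $\Delta \vdash_\Sigma B = A \in [\![\kappa]\!]$.

(3) If $\Delta \vdash_\Sigma \sigma = \theta \in [\![\Theta]\!]$ then $\Delta \vdash_\Sigma \theta = \sigma \in [\![\Theta]\!]$.

Lemma 23 (Logical relation transitivity).
Suppose that $\vdash \Sigma\ \mathsf{sig}$ and $\vdash \Delta\ \mathsf{sctx}$.

(1) If $\Delta \vdash_\Sigma M = N \in [\![\tau]\!]$ and $\Delta \vdash_\Sigma N = P \in [\![\tau]\!]$ then $\Delta \vdash_\Sigma M = P \in [\![\tau]\!]$.

(2) If $\Delta \vdash_\Sigma A = B \in [\![\kappa]\!]$ and $\Delta \vdash_\Sigma B = C \in [\![\kappa]\!]$ then $\Delta \vdash_\Sigma A = C \in [\![\kappa]\!]$.

(3) If $\Delta \vdash_\Sigma \sigma = \theta \in [\![\Theta]\!]$ and $\Delta \vdash_\Sigma \theta = \delta \in [\![\Theta]\!]$ then $\Delta \vdash_\Sigma \sigma = \delta \in [\![\Theta]\!]$.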

The proof that definitionally equal terms are logically related required some care to
formalize. The key step is showing that applying logically related substitutions to 
definitionally equal terms yields logically related terms. Establishing this (via the 
following lemma) required identifying and proving a number of standard properties 
of simultaneous substitutions. In contrast, reasoning about single substitutions 
sufficed almost everywhere else in the formalization. 

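Lemma 24. Suppose $\vdash \Delta\ \mathsf{sctx}$ and $\Delta \vdash_\Sigma \sigma = \theta \in [\![\Gamma^-]\!]$.

(1) If $\Gamma \vdash_\Sigma M = N : A$ then $\Delta \vdash_{\Sigma^-} M[\sigma] = N[\theta] \in [\![A^-]\!]$.

(2) If $\Gamma \vdash_\Sigma A = B : K$ then $\Delta \vdash_{\Sigma^-} A[\sigma] = B[\theta] \in [\![K^-]\!]$.

The last step needed to establish completeness is to show that the identity
substitution over a given context (written $\mathit{id}_\Gamma$) is related to itself:

Lemma 25. If $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ then $\Gamma^- \vdash_{\Sigma^-} \mathit{id}_\Gamma = \mathit{id}_\Gamma \in [\![\Gamma^-]\!]$.

Theorem 6 (Definitionally equal terms are logically related).

(1) If $\Gamma \vdash_\Sigma M = N : A$ then $\Gamma^- \vdash_{\Sigma^-} M = N \in [\![A^-]\!]$.

(2) If $\Gamma \vdash_\Sigma A = B : K$ then $\Gamma^- \vdash_{\Sigma^-} A = B \in [\![K^-]\!]$.

Corollary 1 (Completeness).

(1) If $\Gamma \vdash_\Sigma M = N : A$ then $\Gamma^- \vdash_{\Sigma^-} M \Leftrightarrow N : A^-$.

(2) If $\Gamma \vdash_\Sigma A = B : K$ then $\Gamma^- \vdash_{\Sigma^-} A \Leftrightarrow B : K^-$.

(3) If $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$ then $\Gamma^- \vdash_{\Sigma^-} K \Leftrightarrow L : \mathsf{kind}^-$.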

Note that part 3 of Cor. 1 was omitted from HP05, but it is straightforward to 
prove by induction given parts 1 and 2, and algorithmic transitivity and symmetry. 

3.4 Soundness 

Soundness of algorithmic object (or type, or kind) equivalence means that if two
well-formed objects (or type families, or kinds respectively) are algorithmically
equivalent then they are also definitionally equivalent. For example, for objects,
Thm. 2(1) states:

If $\Gamma^- \vdash_{\Sigma^-} M \Leftrightarrow N : A^-$ and $\Gamma \vdash_\Sigma M : A$ and $\Gamma \vdash_\Sigma N : A$
then $\Gamma \vdash_\Sigma M = N : A$.

First, though, since the algorithmic judgments perform weak head reduction, we 
must show that weak head reduction preserves well-formedness: 

Lemma 26 (Subject reduction). Suppose $M \xrightarrow{\mathrm{whr}} M'$ and $\Gamma \vdash_\Sigma M : A$. Then
$\Gamma \vdash_\Sigma M' : A$ and $\Gamma \vdash_\Sigma M = M' : A$.

Naturally, since algorithmic and structural equivalences for objects and types are
defined by simultaneous induction, we must also prove a simultaneous soundness
property for the structural equivalence judgments. For example, to prove
Thm. 2(1), we also need to show by simultaneous induction that:

If $\Gamma^- \vdash_{\Sigma^-} M \leftrightarrow N : \tau$ and $\Gamma \vdash_\Sigma M : A$ and $\Gamma \vdash_\Sigma N : B$
then $\Gamma \vdash_\Sigma M = N : A$ and $\Gamma \vdash_\Sigma A = B : \mathsf{type}$ and $A^- = \tau$ and
$B^- = \tau$.

In contrast to completeness, the proof of soundness in HP05 proceeds by entirely
syntactic techniques, by induction over the structure of algorithmic and structural 
derivations, using standard syntactic properties and subject reduction. Our initial 
formalization attempt followed the proofs given by HP05. However, we encountered 
two difficulties which were not addressed in the article. Both difficulties have to 
do with algorithmic rules for checking equivalence at function types (or function 
kinds) using extensionality. In the rest of this section, we first discuss and address a 
minor difficulty involving extensionality in the proof of Thm. 2(1). We then discuss 
a more serious complication in proving soundness at the level of types, and show 
how to fix the problem. We conclude by summarizing the soundness results. 
Soundness for algorithmic object equivalence. In proving the soundness of
algorithmic extensionality for objects arising in part 1 of Thm. 2, recall that we
have a derivation of the form:

\[
\frac{(x, \tau_1)::\Gamma^- \vdash_{\Sigma^-} M\,x \Leftrightarrow N\,x : \tau_2 \qquad x \mathrel{\#} (\Gamma^-, M, N)}
     {\Gamma^- \vdash_{\Sigma^-} M \Leftrightarrow N : \tau_1 \to \tau_2}
\]

and we also know that $\Gamma \vdash_\Sigma M : A$ and $\Gamma \vdash_\Sigma N : A$ for some $A$ with
$A^- = \tau_1 \to \tau_2$. In order to apply the induction hypothesis, we need to know that
$M\,x$ and $N\,x$ are well-formed in an extended context $(x, A_1)::\Gamma$. HP05's proof
begins by assuming that $\Gamma \vdash_\Sigma M : \Pi x{:}A_1.\,A_2$ and $\Gamma \vdash_\Sigma N : \Pi x{:}A_1.\,A_2$, and
proceeding using inversion properties. However, it is not immediately clear that
$A^- = \tau_1 \to \tau_2$ implies that $A = \Pi x{:}A_1.\,A_2$ for some $A_1$ and $A_2$; indeed, this
can fail to be the case if $A$ is not well-formed. Instead, we first need the following
inversion principles for erasure:

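Lemma 27 (Erasure inversion).

(1) If $\Gamma \vdash_\Sigma A : \Pi x{:}B.\,K$ then $\exists c.\ A^- = c^-$.

(2) If $\tau_1 \to \tau_2 = A^-$ and $\Gamma \vdash_\Sigma A : \mathsf{type}$ and $x \mathrel{\#} A$ then
$\exists A_1\, A_2.\ A = \Pi x{:}A_1.\,A_2$.

(3) If $\tau \to \kappa = K^-$ and $x \mathrel{\#} K$ then $\exists A\, L.\ K = \Pi x{:}A.\,L$.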

Proof. Part 1 follows by induction on the derivation. Parts 2 and 3 follow by
induction on the structure of $A$ and $K$ respectively. In the case for type applications
$A\,M$, clearly $A$ has a $\Pi$-kind, but by part 1, $A$ erases to a constant, contradicting
the assumption that $A^- = \tau_1 \to \tau_2$. So the case is vacuous. The remaining cases
of part 2 are straightforward, as are the cases for part 3. □
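
To make the role of well-formedness in Lemma 27 concrete, the following Python sketch implements erasure on a small tuple encoding of LF type families. The encoding and the function name `erase` are assumptions made for this illustration; only the three erasure equations (constants erase to base types, applications drop their object argument, and $\Pi$-types erase to arrows) are taken from the definitions used in this article.

```python
# LF type families: ("const", a) | ("app", A, M) | ("pi", x, A1, A2)
# Simple types:     ("base", a)  | ("arrow", tau1, tau2)

def erase(ty):
    """Erasure A^- from LF type families to simple types."""
    tag = ty[0]
    if tag == "const":
        return ("base", ty[1])                 # a^- is a base type
    if tag == "app":
        return erase(ty[1])                    # (A M)^- = A^-: the object is dropped
    if tag == "pi":
        return ("arrow", erase(ty[2]), erase(ty[3]))
    raise ValueError(f"not a type family: {ty!r}")

a, b = ("const", "a"), ("const", "b")
pi_ab = ("pi", "x", a, b)                      # Pi x:a. b
ill_kinded = ("app", pi_ab, ("var", "m"))      # (Pi x:a. b) m, not well-formed
```

Here `erase(ill_kinded)` equals `erase(pi_ab)`, an arrow type, although `ill_kinded` is an application rather than a $\Pi$-type. This is the situation that the well-formedness hypothesis in part 2 rules out: a well-kinded application has a head of $\Pi$-kind, which by part 1 erases to a constant.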

Using Lem. 27, we can complete the proof of the first part of Thm. 2 as described 
in HP05: 

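Lemma 28 (Soundness of algorithmic object equivalence).

(1) If $\Gamma^- \vdash_{\Sigma^-} M \Leftrightarrow N : A^-$ and $\Gamma \vdash_\Sigma M : A$ and $\Gamma \vdash_\Sigma N : A$ then
$\Gamma \vdash_\Sigma M = N : A$.

(2) If $\Gamma^- \vdash_{\Sigma^-} M \leftrightarrow N : \tau$ and $\Gamma \vdash_\Sigma M : A$ and $\Gamma \vdash_\Sigma N : B$ then
$\Gamma \vdash_\Sigma M = N : A$ and $\Gamma \vdash_\Sigma A = B : \mathsf{type}$ and $A^- = \tau$ and $B^- = \tau$.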

Soundness for algorithmic type equivalence. The second problem we encountered 
arises in the proof of soundness for the extensionality rule in the algorithmic type 
equivalence judgment (part 3 of Thm. 2). In this case, we have a derivation of the 
form: 

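\[
\frac{(x, \tau)::\Gamma^- \vdash_{\Sigma^-} A\,x \Leftrightarrow B\,x : \kappa \qquad x \mathrel{\#} (\Gamma^-, A, B)}
     {\Gamma^- \vdash_{\Sigma^-} A \Leftrightarrow B : \tau \to \kappa}
\]

We can easily show that the induction hypothesis applies, using the same
technique as above, ultimately deriving $(x, A')::\Gamma \vdash_\Sigma A\,x = B\,x : K$ for some $A'$ and
$K$. However, we cannot complete the proof of this case in the same way as for
object extensionality, because HP05's variant of LF does not include a type-level
extensionality rule

\[
\frac{\Gamma \vdash_\Sigma A : \Pi x{:}C.\,K \qquad \Gamma \vdash_\Sigma B : \Pi x{:}C.\,K \qquad \Gamma \vdash_\Sigma C : \mathsf{type} \qquad (x, C)::\Gamma \vdash_\Sigma A\,x = B\,x : K \qquad x \mathrel{\#} \Gamma}
     {\Gamma \vdash_\Sigma A = B : \Pi x{:}C.\,K}
\]

that would permit us to conclude that $\Gamma \vdash_\Sigma A = B : \Pi x{:}A'.\,K$.

\[
\frac{(a, \kappa) \in \Sigma \qquad \vdash \Sigma\ \mathsf{ssig} \qquad \vdash \Delta\ \mathsf{sctx}}
     {\Delta \vdash_\Sigma a \Leftrightarrow_w a : \kappa}
\qquad
\frac{\Delta \vdash_\Sigma A \Leftrightarrow_w B : \tau \to \kappa \qquad \Delta \vdash_\Sigma M \Leftrightarrow N : \tau}
     {\Delta \vdash_\Sigma A\,M \Leftrightarrow_w B\,N : \kappa}
\]
\[
\frac{\Delta \vdash_\Sigma A_1 \Leftrightarrow_w B_1 : \mathsf{type}^- \qquad (x, A_1^-)::\Delta \vdash_\Sigma A_2 \Leftrightarrow_w B_2 : \mathsf{type}^- \qquad x \mathrel{\#} (\Delta, A_1, B_1)}
     {\Delta \vdash_\Sigma \Pi x{:}A_1.\,A_2 \Leftrightarrow_w \Pi x{:}B_1.\,B_2 : \mathsf{type}^-}
\]

Fig. 6. Weak algorithmic type equivalence judgment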

It was not immediately clear to us whether the original proof could be repaired. 
There appear to be several ways to fix this problem by changing the definitional or 
algorithmic rules. One way is simply to add the above extensionality rule for types 
to the definitional system. Using our formalization, we were easily able to verify 
that this solves the problem and does not introduce any new complications. For 
this we had to make sure that every proof done earlier is either not affected by this 
additional rule or can be extended to include it. 

A second solution, suggested by Harper (personal communication), is to observe
that the original algorithmic rules were unnecessarily general. In the absence of
type-level λ-abstraction, the weaker, syntax-directed type equivalence rules shown
in Fig. 6 suffice. We can easily prove that these rules are sound with respect to
definitional type equivalence:

Lemma 29 (Soundness of weak type equivalence).
If $\Gamma^- \vdash_{\Sigma^-} A \Leftrightarrow_w B : \kappa$ and $\Gamma \vdash_\Sigma A : K$ and $\Gamma \vdash_\Sigma B : L$ then $\Gamma \vdash_\Sigma A = B : K$,
$\Gamma \vdash_\Sigma K = L : \mathsf{kind}$, $K^- = \kappa$ and $L^- = \kappa$.

Proof. Similar to the proof of soundness of algorithmic and structural type
equivalence from HP05. Requires soundness of object equivalence (Lem. 28). □

With this change, we can prove completeness using a slightly modified logical
relation: the type-level logical relation needs to be redefined as

\[
\Delta \vdash_\Sigma A = B \in [\![\kappa]\!] \;=\; \Delta \vdash_\Sigma A \Leftrightarrow_w B : \kappa.
\]

The first two solutions however establish soundness only for variants of the
definitions in HP05. In particular, the first shows that the original algorithmic rules
are sound with respect to a stronger notion of definitional equality, while the second
gives a correct modified algorithm for the original definitional rules. Either solution
appears reasonable, but neither tells us whether the original equivalence algorithm
is sound with respect to the original definitional system in HP05. We felt it was
important to determine whether or not a change to the definitions is truly necessary
to recover soundness. In the rest of this section we show that the original results
hold as stated.

Since we already established that weak type equivalence implies definitional
equivalence (for well-formed terms), it suffices to show that the original algorithmic
type equivalence judgments imply weak type equivalence. To do so, we need to
show that weak type equivalence admits extensionality (Lem. 34 below). This is
nontrivial: we first need to develop some syntactic properties of algorithmic
equivalence for objects, in particular that if $\Delta \vdash_\Sigma x \Leftrightarrow x : \tau$ then $(x, \tau) \in \Delta$. This seems
obvious, but the proof is slightly subtle because the algorithmic equivalence
judgment is type-directed, not syntax-directed. Indeed, if we try to prove this directly
by induction, then in the case where $x$ has function type, the inductive hypothesis
does not apply.

Instead, we need to show something more general: for any term $M_0$ of the form
$x\ y_1 \cdots y_n$, if $M_0$ is algorithmically equivalent to itself then every free variable of
$M_0$ appears in $\Delta$ with an appropriate type. We say that such an object $M_0$ is an
applied variable, defined formally as follows:

\[
M_0 ::= x \mid M_0\ x
\]

that is, it is a variable applied to a sequence of variables. Clearly, applied variables
are weak head normal forms:

Lemma 30. If $M_0$ is an applied variable then $M_0$ is in weak head normal form.
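
The grammar of applied variables and the statement of Lemma 30 can be checked on examples with a short Python sketch. The tuple encoding of objects and both function names are assumptions of this illustration; weak head normality is taken here to mean that the head of the application spine is not a λ-abstraction, so that no weak head reduction step applies.

```python
# Objects (type labels omitted):
#   ("var", x) | ("const", c) | ("app", M, N) | ("lam", x, M)

def is_applied_variable(t):
    """Recognise the grammar  M0 ::= x | M0 x."""
    if t[0] == "var":
        return True
    return t[0] == "app" and t[2][0] == "var" and is_applied_variable(t[1])

def is_whnf(t):
    """True if no weak head reduction step applies to t."""
    if t[0] != "app":
        return True                            # variables, constants, abstractions
    head = t
    while head[0] == "app":                    # walk down to the head of the spine
        head = head[1]
    return head[0] != "lam"                    # a lambda head would form a redex
```

Every term accepted by `is_applied_variable` has a variable at the head of its spine, so `is_whnf` accepts it as well, which mirrors the proof of Lemma 30.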

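We then introduce a weak well-formedness relation $\Delta \vdash_0 M_0 : \tau$ for applied
variables, defined as follows:

\[
\frac{(x, \tau) \in \Delta}{\Delta \vdash_0 x : \tau}
\qquad
\frac{\Delta \vdash_0 M_0 : \tau_1 \to \tau_2 \qquad (y, \tau_1) \in \Delta}{\Delta \vdash_0 M_0\ y : \tau_2}
\]

It is easy to show that $\vdash_0$ satisfies strengthening:

Lemma 31. If $(y, \tau')::\Delta \vdash_0 M_0 : \tau$ and $y \mathrel{\#} M_0$ then $\Delta \vdash_0 M_0 : \tau$.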
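
Because the two rules for weak well-formedness are syntax-directed, the relation can be read as a function from a context and an applied variable to a simple type. The Python sketch below takes this reading; the list-of-pairs encoding of contexts, the tuple encoding of simple types, and the name `weak_type` are assumptions made for illustration, and contexts are assumed to be valid (each variable bound at most once).

```python
# Simple types: ("base", a) | ("arrow", tau1, tau2)
# Contexts:     list of (variable, simple type) pairs, newest binding first

def weak_type(ctx, t):
    """Return tau such that ctx |-0 t : tau, or None if there is no derivation."""
    env = dict(ctx)                            # valid contexts bind each variable once
    if t[0] == "var":
        return env.get(t[1])                   # rule for variables
    if t[0] == "app" and t[2][0] == "var":
        fun = weak_type(ctx, t[1])             # rule for M0 y
        if fun is not None and fun[0] == "arrow" and env.get(t[2][1]) == fun[1]:
            return fun[2]
    return None

a, b = ("base", "a"), ("base", "b")
delta = [("x", ("arrow", a, b)), ("z", a)]
m0 = ("app", ("var", "x"), ("var", "z"))       # the applied variable  x z
```

With these definitions, `weak_type(delta, m0)` and `weak_type([("y", b)] + delta, m0)` both return `b`: since `y` does not occur in `m0`, its binding is never consulted, which is the content of Lemma 31.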

Furthermore, if an applied variable is algorithmically or structurally equivalent 
to itself, then it is weakly well-formed: 

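Lemma 32. Suppose $M_0$ is an applied variable and $\vdash \Delta\ \mathsf{sctx}$.

(1) If $\Delta \vdash_\Sigma M_0 \Leftrightarrow M_0 : \tau$ then $\Delta \vdash_0 M_0 : \tau$.

(2) If $\Delta \vdash_\Sigma M_0 \leftrightarrow M_0 : \tau$ then $\Delta \vdash_0 M_0 : \tau$.

Proof. Induction on derivations. Lem. 30 is needed to show that the cases
involving weak head reduction are vacuous. The only other interesting case is the
case for an extensionality rule

\[
\frac{(x, \tau_1)::\Delta \vdash_\Sigma M_0\,x \Leftrightarrow M_0\,x : \tau_2 \qquad x \mathrel{\#} (\Delta, M_0, M_0)}
     {\Delta \vdash_\Sigma M_0 \Leftrightarrow M_0 : \tau_1 \to \tau_2}
\]

By induction, we have that $(x, \tau_1)::\Delta \vdash_0 M_0\,x : \tau_2$. By inversion, we can show
that $(x, \tau_1)::\Delta \vdash_0 M_0 : \tau_1 \to \tau_2$. To complete the proof, we use Lem. 31 to show
that $\Delta \vdash_0 M_0 : \tau_1 \to \tau_2$, which follows since $x \mathrel{\#} M_0$. □

Corollary 2. If $\Delta \vdash_\Sigma x \Leftrightarrow x : \tau$ and $\vdash \Delta\ \mathsf{sctx}$ then $(x, \tau) \in \Delta$.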

We also need to establish strengthening for weak algorithmic type equivalence: 
Lemma 33 (Strengthening of weak type equivalence).
If $\Delta' @ [(x, \tau)] @ \Delta \vdash_\Sigma A \Leftrightarrow_w B : \kappa$ and $x \mathrel{\#} (\Delta', A, B)$ then
$\Delta' @ \Delta \vdash_\Sigma A \Leftrightarrow_w B : \kappa$.

Proof. Straightforward induction on derivations. Note that we need Lem. 19
here in the case for structural equivalence of type applications. □

We now establish the admissibility of extensionality for weak type equivalence:

Lemma 34 (Extensionality of weak type equivalence).
If $(x, \tau)::\Delta \vdash_\Sigma A\,x \Leftrightarrow_w B\,x : \kappa$ and $x \mathrel{\#} (\Delta, A, B)$ and $\vdash \Delta\ \mathsf{sctx}$ then
$\Delta \vdash_\Sigma A \Leftrightarrow_w B : \tau \to \kappa$.

Proof. By inversion, we have subderivations $(x, \tau)::\Delta \vdash_\Sigma A \Leftrightarrow_w B : \tau' \to \kappa$ and
$(x, \tau)::\Delta \vdash_\Sigma x \Leftrightarrow x : \tau'$ for some $\tau'$. Using Cor. 2 on the second subderivation we
have that $(x, \tau') \in (x, \tau)::\Delta$ and using the validity of $(x, \tau)::\Delta$ we know that $\tau = \tau'$.
Hence, $(x, \tau)::\Delta \vdash_\Sigma A \Leftrightarrow_w B : \tau \to \kappa$. Using Lem. 33 we conclude
$\Delta \vdash_\Sigma A \Leftrightarrow_w B : \tau \to \kappa$. □

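Lemma 35. Suppose $\vdash \Delta\ \mathsf{sctx}$. Then:

(1) If $\Delta \vdash_\Sigma A \Leftrightarrow B : \kappa$ then $\Delta \vdash_\Sigma A \Leftrightarrow_w B : \kappa$.

(2) If $\Delta \vdash_\Sigma A \leftrightarrow B : \kappa$ then $\Delta \vdash_\Sigma A \Leftrightarrow_w B : \kappa$.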

Proof. By induction on the structure of derivations. The case for the algorith- 
mic type extensionality rule requires Lem. 34. □ 

The proof of Thm. 2 is completed as follows. 

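Lemma 36 (Soundness of algorithmic type equivalence).

(1) If $\Gamma^- \vdash_{\Sigma^-} A \Leftrightarrow B : K^-$ and $\Gamma \vdash_\Sigma A : K$ and $\Gamma \vdash_\Sigma B : K$ then
$\Gamma \vdash_\Sigma A = B : K$.

(2) If $\Gamma^- \vdash_{\Sigma^-} A \leftrightarrow B : \kappa$ and $\Gamma \vdash_\Sigma A : K$ and $\Gamma \vdash_\Sigma B : L$ then
$\Gamma \vdash_\Sigma A = B : K$, $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$, $K^- = \kappa$ and $L^- = \kappa$.

Proof. Immediate using Lem. 35 and 29. □

Lemma 37 (Soundness of algorithmic kind equivalence).
If $\Gamma^- \vdash_{\Sigma^-} K \Leftrightarrow L : \mathsf{kind}^-$ and $\Gamma \vdash_\Sigma K : \mathsf{kind}$ and $\Gamma \vdash_\Sigma L : \mathsf{kind}$ then
$\Gamma \vdash_\Sigma K = L : \mathsf{kind}$.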

Proof. As in HP05, using Lem. 36 as necessary. □ 

Thm. 2 follows immediately from Lem. 28, 36 and 37. 

3.5 Algorithmic typechecking 

After the soundness and completeness proof, HP05 introduces an algorithmic ver- 
sion of the typechecking judgment, proves additional syntactic properties of def- 
initional equivalence, sketches proofs of decidability, and discusses quasicanonical 
forms and adequacy of LF encodings of object languages. We formalized many of 
these results and we will discuss them in the next few sections. 

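Context validity $\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx}$ (rule conclusions only; the premises are not legible in this copy):

\[
\vdash_\Sigma [] \Rightarrow \mathsf{ctx}
\qquad
\vdash_\Sigma (x, A)::\Gamma \Rightarrow \mathsf{ctx}
\]

Objects $\Gamma \vdash_\Sigma M \Rightarrow A$:

\[
\frac{\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx} \qquad (x, A) \in \Gamma}{\Gamma \vdash_\Sigma x \Rightarrow A}
\qquad
\frac{\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx} \qquad (c, A) \in \Sigma}{\Gamma \vdash_\Sigma c \Rightarrow A}
\]
\[
\frac{\Gamma \vdash_\Sigma M_1 \Rightarrow \Pi x{:}A_2'.\,A_1 \qquad \Gamma \vdash_\Sigma M_2 \Rightarrow A_2 \qquad \Gamma^- \vdash_{\Sigma^-} A_2 \Leftrightarrow A_2' : \mathsf{type}^- \qquad x \mathrel{\#} \Gamma}
     {\Gamma \vdash_\Sigma M_1\,M_2 \Rightarrow A_1[x := M_2]}
\]
\[
\frac{\Gamma \vdash_\Sigma A_1 \Rightarrow \mathsf{type} \qquad (x, A_1)::\Gamma \vdash_\Sigma M_2 \Rightarrow A_2 \qquad x \mathrel{\#} (\Gamma, A_1)}
     {\Gamma \vdash_\Sigma \lambda x{:}A_1.\,M_2 \Rightarrow \Pi x{:}A_1.\,A_2}
\]

Type families $\Gamma \vdash_\Sigma A \Rightarrow K$:

\[
\frac{\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx} \qquad (a, K) \in \Sigma}{\Gamma \vdash_\Sigma a \Rightarrow K}
\]
\[
\frac{\Gamma \vdash_\Sigma A \Rightarrow \Pi x{:}A_2'.\,K_1 \qquad \Gamma \vdash_\Sigma M \Rightarrow A_2 \qquad \Gamma^- \vdash_{\Sigma^-} A_2 \Leftrightarrow A_2' : \mathsf{type}^- \qquad x \mathrel{\#} \Gamma}
     {\Gamma \vdash_\Sigma A\,M \Rightarrow K_1[x := M]}
\]
\[
\frac{\Gamma \vdash_\Sigma A_1 \Rightarrow \mathsf{type} \qquad (x, A_1)::\Gamma \vdash_\Sigma A_2 \Rightarrow \mathsf{type} \qquad x \mathrel{\#} (\Gamma, A_1)}
     {\Gamma \vdash_\Sigma \Pi x{:}A_1.\,A_2 \Rightarrow \mathsf{type}}
\]

Kinds $\Gamma \vdash_\Sigma K \Rightarrow \mathsf{kind}$:

\[
\frac{\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx}}{\Gamma \vdash_\Sigma \mathsf{type} \Rightarrow \mathsf{kind}}
\qquad
\frac{\Gamma \vdash_\Sigma A \Rightarrow \mathsf{type} \qquad (x, A)::\Gamma \vdash_\Sigma K \Rightarrow \mathsf{kind} \qquad x \mathrel{\#} (\Gamma, A)}
     {\Gamma \vdash_\Sigma \Pi x{:}A.\,K \Rightarrow \mathsf{kind}}
\]

Fig. 7. Algorithmic typechecking rules

The typechecking algorithm in HP05 traverses terms, types and kinds in a
syntax-directed manner, using the algorithmic equivalence judgment in certain places.
The definition of algorithmic typechecking in HP05 omitted explicit definitions of
algorithmic signature and context validity. In our formalization, we added these
(obvious) rules, as shown in Fig. 7. The remaining rules are the same as in HP05 except
for a trivial typographical error in the rule for type constants. Proving the
soundness and completeness of algorithmic typechecking is a (mostly) straightforward
exercise using soundness and completeness of algorithmic equivalence and various
syntactic properties: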

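Theorem 7 (Soundness of algorithmic typechecking).

(1) If $\vdash \Sigma \Rightarrow \mathsf{sig}$ then $\vdash \Sigma\ \mathsf{sig}$.

(2) If $\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx}$ then $\vdash_\Sigma \Gamma\ \mathsf{ctx}$.

(3) If $\Gamma \vdash_\Sigma M \Rightarrow A$ then $\Gamma \vdash_\Sigma M : A$.

(4) If $\Gamma \vdash_\Sigma A \Rightarrow K$ then $\Gamma \vdash_\Sigma A : K$.

(5) If $\Gamma \vdash_\Sigma K \Rightarrow \mathsf{kind}$ then $\Gamma \vdash_\Sigma K : \mathsf{kind}$.

Theorem 8 (Completeness of algorithmic typechecking).

(1) If $\vdash \Sigma\ \mathsf{sig}$ then $\vdash \Sigma \Rightarrow \mathsf{sig}$.

(2) If $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ then $\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx}$.

(3) If $\Gamma \vdash_\Sigma M : A$ then $\exists A'.\ \Gamma \vdash_\Sigma M \Rightarrow A'$ and $\Gamma \vdash_\Sigma A = A' : \mathsf{type}$.

(4) If $\Gamma \vdash_\Sigma A : K$ then $\exists K'.\ \Gamma \vdash_\Sigma A \Rightarrow K'$ and $\Gamma \vdash_\Sigma K = K' : \mathsf{kind}$.

(5) If $\Gamma \vdash_\Sigma K : \mathsf{kind}$ then $\Gamma \vdash_\Sigma K \Rightarrow \mathsf{kind}$.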

3.6 Strengthening and strong extensionality 

The strengthening property states that all of the definitional judgments are pre- 
served by removing an unused variable from the context. We already established 
strengthening for the algorithmic equivalence judgments (Lem. 19). In order to 
establish strengthening for the algorithmic typechecking judgments, we need a 
stronger freshness lemma for algorithmic typechecking, which was not discussed 
in HP05: 

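Lemma 38 (Strong algorithmic freshness). Let $\Gamma = \Gamma_1 @ [(x, B)] @ \Gamma_2$.

(1) If $\Gamma \vdash_\Sigma M \Rightarrow A$ and $x \mathrel{\#} (\Gamma_1, M)$ then $x \mathrel{\#} A$.

(2) If $\Gamma \vdash_\Sigma A \Rightarrow K$ and $x \mathrel{\#} (\Gamma_1, A)$ then $x \mathrel{\#} K$.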

We can now prove strengthening for algorithmic typechecking by induction on 
derivations: 

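Theorem 9 (Strengthening of algorithmic typechecking).
Let $\Gamma = \Gamma_1 @ [(x, B)] @ \Gamma_2$.

(1) If $\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx}$ and $x \mathrel{\#} \Gamma_1$ then $\vdash_\Sigma \Gamma_1 @ \Gamma_2 \Rightarrow \mathsf{ctx}$.

(2) If $\Gamma \vdash_\Sigma K \Rightarrow \mathsf{kind}$ and $x \mathrel{\#} (\Gamma_1, K)$ then $\Gamma_1 @ \Gamma_2 \vdash_\Sigma K \Rightarrow \mathsf{kind}$.

(3) If $\Gamma \vdash_\Sigma A \Rightarrow K$ and $x \mathrel{\#} (\Gamma_1, A)$ then $\Gamma_1 @ \Gamma_2 \vdash_\Sigma A \Rightarrow K$.

(4) If $\Gamma \vdash_\Sigma M \Rightarrow A$ and $x \mathrel{\#} (\Gamma_1, M)$ then $\Gamma_1 @ \Gamma_2 \vdash_\Sigma M \Rightarrow A$.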

Proof. The proof is straightforward, using strengthening for algorithmic equiv- 
alence; parts (1-4) need to be proved in the order stated above since we need 
strengthening for contexts everywhere, we need strengthening for kinds to prove 
strengthening for types, and so on. Lem. 38 is needed in the cases for object and 
type application. □ 

Finally, we can prove strengthening for the definitional system. 

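Theorem 10 (Strengthening). Let $\Gamma = \Gamma_1 @ [(x, B)] @ \Gamma_2$.

(1) If $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ and $x \mathrel{\#} \Gamma_1$ then $\vdash_\Sigma \Gamma_1 @ \Gamma_2\ \mathsf{ctx}$.

(2) If $\Gamma \vdash_\Sigma K : \mathsf{kind}$ and $x \mathrel{\#} (\Gamma_1, K)$ then $\Gamma_1 @ \Gamma_2 \vdash_\Sigma K : \mathsf{kind}$.

(3) If $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$ and $x \mathrel{\#} (\Gamma_1, K, L)$ then $\Gamma_1 @ \Gamma_2 \vdash_\Sigma K = L : \mathsf{kind}$.

(4) If $\Gamma \vdash_\Sigma A : K$ and $x \mathrel{\#} (\Gamma_1, A)$ then $\Gamma_1 @ \Gamma_2 \vdash_\Sigma A : K$.

(5) If $\Gamma \vdash_\Sigma A = B : K$ and $x \mathrel{\#} (\Gamma_1, A, B)$ then $\Gamma_1 @ \Gamma_2 \vdash_\Sigma A = B : K$.

(6) If $\Gamma \vdash_\Sigma M : A$ and $x \mathrel{\#} (\Gamma_1, M)$ then $\Gamma_1 @ \Gamma_2 \vdash_\Sigma M : A$.

(7) If $\Gamma \vdash_\Sigma M = N : A$ and $x \mathrel{\#} (\Gamma_1, M, N)$ then $\Gamma_1 @ \Gamma_2 \vdash_\Sigma M = N : A$.
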
Proof. The proof follows the sketch in the article, using algorithmic strengthening
and soundness and completeness of the algorithmic judgments, but some care
is needed. Part 1 is straightforward, but we must prove the remaining cases in the 
specific order listed: first kind validity, then kind equivalence, then type validity, 
etc. The reason is that to prove strengthening for the equivalence judgments, we 
need strengthening for the corresponding validity judgments because of the valid- 
ity side-conditions on Thm. 2. In turn, to prove strengthening for the object and 
type validity judgments, we need strengthening for type and kind equivalence re- 
spectively, because of the respective type and kind equivalence judgments in the 
conclusions of Thm. 8. Lem. 38 is needed in parts (4) and (6). □

HP05 also sketched a proof of admissibility of a stronger version of the exten- 
sionality rule which omits the well-formedness checks: 

\[
\frac{(x, A_1)::\Gamma \vdash_\Sigma M\,x = N\,x : A_2 \qquad x \mathrel{\#} (M, N)}
     {\Gamma \vdash_\Sigma M = N : \Pi x{:}A_1.\,A_2}
\]

However, the short proof sketched in the article actually requires a substantial
amount of work to formalize. The first two steps of their informal proof were as
follows:

(1) By validity, we have $(x, A_1)::\Gamma \vdash_\Sigma M\,x : A_2$.

(2) By inversion, we have $(x, A_1)::\Gamma \vdash_\Sigma M : \Pi x{:}B_1.\,B_2$ and $(x, A_1)::\Gamma \vdash_\Sigma x : B_1$.

However, step (2) above does not follow immediately from the inversion lemmas
proved earlier. In particular, we only know that $M$ will have a type of the form
$\Pi y{:}B_1.\,B_2$ for some $y$, $B_1$ and $B_2$ such that $(x, A_1)::\Gamma \vdash_\Sigma M : \Pi y{:}B_1.\,B_2$ and
$(x, A_1)::\Gamma \vdash_\Sigma x : B_1$ and $(x, A_1)::\Gamma \vdash_\Sigma A_2 = B_2[y := x] : \mathsf{type}$. Moreover, in this
case we cannot use the strong version of the inversion lemma to avoid this problem,
because $x$ is already in use in the context.

Although their proof looks rigorous and detailed, here Harper and Pfenning ap- 
pear to employ implicit "without loss of generality" reasoning about inversion and 
renaming that is not easy to formalize directly. Instead we needed to show carefully 
that: 

Lemma 39. If $(x, A_1)::\Gamma \vdash_\Sigma M\,x : A_2$ and $x \mathrel{\#} M$ then $\Gamma \vdash_\Sigma M : \Pi x{:}A_1.\,A_2$.

Proof. The proof proceeds by applying validity and inversion principles, as
discussed above. One subtle freshness side-condition is the fact that $x$ is fresh
for $\Pi y{:}B_1.\,B_2$, and this is proved by translating to the algorithmic typechecking
system and using Lem. 38. □

Strong extensionality then follows essentially as in HP05, using Lem. 39 to fill the
gap identified above:

Theorem 11 (Strong extensionality).
If $(x, A_1)::\Gamma \vdash_\Sigma M\,x = N\,x : A_2$ and $x \mathrel{\#} (M, N)$ then
$\Gamma \vdash_\Sigma M = N : \Pi x{:}A_1.\,A_2$.

3.7 Decidability 

HP05 also sketches proofs of the decidability of the algorithmic judgments (and
hence also the definitional system). Reasoning about decidability within Isabelle/HOL
is not straightforward because Isabelle/HOL is based on classical logic. Thus,
unlike in constructive logics or type theories, we cannot infer decidability of $P$ simply
by proving $P \lor \neg P$. Furthermore, given a relation $R$ definable in Isabelle/HOL, it
is not clear how best to formalize the informal statement "$R$ is decidable".

As a sanity check, we have shown that weak head reduction is strongly normalizing
for well-formed terms. We write $M\Downarrow$ to indicate that $M$ is strongly normalizing
under weak head reduction. This proof uses techniques and definitions from the
example formalization of strong normalization for the simply-typed lambda calculus
in the Nominal Datatype Package.

Theorem 12. If $\Gamma \vdash_\Sigma M : A$ then $M\Downarrow$.

Proof. We first show the (standard) property that if $(M\,x)\Downarrow$ then $M\Downarrow$. We then
show that if $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau$ then $M\Downarrow$ by induction on derivations. The main
result follows by reflexivity and Thm. 1. □

Turning now to the issue of formalizing decidability properties in Isabelle/HOL, 
we considered the following options: 

Formalizing computability theory. It should be possible to define Turing machines 
(or some other universal model of computation) within Isabelle/HOL and derive 
enough of the theory of computation to be able to prove that the algorithmic 
equivalence and typechecking relations are decidable. It appears to be an open 
question how to formalize proofs of decidability in Isabelle/HOL, especially for 
algorithms over complex data structures such as nominal datatypes. Although 
this would probably be the most satisfying solution, it would also require a major 
additional formalization effort, including a great deal of work that is orthogonal to
the issues addressed here. Another possibility would be to restrict Isabelle/HOL 
to a constructive fragment, but this seems even more difficult and time-consuming 
since Isabelle/HOL makes extensive use of choice principles and the law of excluded 
middle. We therefore view fully formalizing decidability in this way as beyond the 
scope of this article. Instead, we consider other techniques that stop short of full 
formalization while providing some convincing evidence for decidability. 

Bounded-height derivations. We could define height-bounded versions of the
algorithmic typechecking relations and prove that there is a computable bound on
the height needed to derive any derivable judgment in the system. That is, there
exists a computable $h$ such that for any inputs $x_1, \ldots, x_n$, there is a derivation of
$J(x_1, \ldots, x_n)$ if and only if there is a derivation of height at most $h(x_1, \ldots, x_n)$.

This seems reasonable intuitively, but there are several problems. First, it is 
not obvious how to obtain a closed-form, recursively defined height bound for the 
number of steps needed for algorithmic equivalence for the same reason it is difficult 
to give an explicit termination measure for weak head normalization. Second, 
even if we could find such an h, this approach begs the question of how to prove 
that h is computable. It is clearly not enough to simply require that some h 
exists, because the Axiom of Choice can be used to define h nonconstructively. 
Finally, inductively defined judgments in Isabelle/HOL may themselves involve
nonconstructive features, including equality at or quantification over infinite types,
negation of undecidable properties, and choice operators. Although the definitions
we have in mind do not use these facilities, there is no easy way to certify this
within Isabelle/HOL.

Inductive definability. We have formalized what we believe is the essence of the
decidability proof using the following methodology. For each inductively defined
relation $R$ we wish to prove decidable, possibly under some constraints $P$:

(1) Inductively define a complement relation $R'$.

(2) (Exclusion) Prove that $\neg (R \land R')$.

(3) (Exhaustion) Prove that $P$ implies $R \lor R'$.

(4) Observe (informally) that $R$ and $R'$ are recursively enumerable since they are
defined inductively by rules without recourse to nonconstructive features such
as negation or universal quantification in the hypotheses. Conclude (informally)
that $P$ implies $R$ is both r.e. and co-r.e., hence decidable.

This approach exploits an intuitive connection between inductively definable
predicates and recursively enumerable sets in step (4). It is important to note 
that this intuition is not rigorously formalized. We argue that this approach does 
force us to perform all of the case analysis that would be necessary in a proper 
decidability proof, but the only way to be certain of this is to fully formalize a 
substantial amount of computability theory in Isabelle/HOL, which as we have 
discussed above would be a major undertaking in its own right. 
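
The informal step (4) rests on the standard observation that a relation which is both r.e. and co-r.e. can be decided by enumerating derivations of $R$ and $R'$ in alternation. The Python sketch below illustrates this on a toy relation that is not taken from the formalization: $R$ is "multiple of three", generated from the axiom 0 by the rule $k \mapsto k + 3$, and $R'$ is generated from the axioms 1 and 2 by the same rule. All names are assumptions of the illustration.

```python
def stages(axioms, rule):
    """Yield the sets of facts derivable with at most 0, 1, 2, ... rule applications."""
    facts = set(axioms)
    while True:
        yield facts
        facts = facts | {rule(f) for f in facts}

def decide(n, r_stages, rc_stages):
    """Alternate between the enumerations of R and R' until n shows up in one.

    Terminates only if exhaustion holds (n is in R or in R'); exclusion
    guarantees that the answer does not depend on which side is checked first.
    """
    for r_facts, rc_facts in zip(r_stages, rc_stages):
        if n in r_facts:
            return True
        if n in rc_facts:
            return False

def is_multiple_of_three(n):
    r = stages({0}, lambda k: k + 3)           # inductive definition of R
    rc = stages({1, 2}, lambda k: k + 3)       # inductive definition of R'
    return decide(n, r, rc)
```

In this toy setting exclusion and exhaustion are evident for natural numbers; for the algorithmic judgments they are exactly the properties proved in steps (2) and (3), while the enumerability of derivations remains the informal part of the argument.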

We call a formula R quasidecidable if both R and its negation are equivalent to 
inductively defined relations, as described above. This is an informal (and inten- 
sional) property; we have not defined quasidecidability explicitly in Isabelle/HOL. 
We have the following lemma, analogous to HP05's Lemma 6.1: 

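Theorem 13 (Quasidecidability of algorithmic equivalence).

(1) If $\Delta \vdash_\Sigma M \Leftrightarrow M' : \tau$ and $\Delta \vdash_\Sigma N \Leftrightarrow N' : \tau$ then $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau$ is
quasidecidable.

(2) If $\Delta \vdash_\Sigma M \leftrightarrow M' : \tau_1$ and $\Delta \vdash_\Sigma N \leftrightarrow N' : \tau_2$ then $\exists \tau_3.\ \Delta \vdash_\Sigma M \leftrightarrow N : \tau_3$
is quasidecidable.

(3) If $\Delta \vdash_\Sigma A \Leftrightarrow A' : \kappa$ and $\Delta \vdash_\Sigma B \Leftrightarrow B' : \kappa$ then $\Delta \vdash_\Sigma A \Leftrightarrow B : \kappa$ is
quasidecidable.

(4) If $\Delta \vdash_\Sigma A \leftrightarrow A' : \kappa_1$ and $\Delta \vdash_\Sigma B \leftrightarrow B' : \kappa_2$ then $\exists \kappa_3.\ \Delta \vdash_\Sigma A \leftrightarrow B : \kappa_3$
is quasidecidable.

(5) If $\Delta \vdash_\Sigma K \Leftrightarrow K' : \mathsf{kind}^-$ and $\Delta \vdash_\Sigma L \Leftrightarrow L' : \mathsf{kind}^-$ then $\Delta \vdash_\Sigma K \Leftrightarrow L : \mathsf{kind}^-$
is quasidecidable.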

We further proved that the algorithmic typechecking judgments are quasidecid- 
able, which is the key step in HP05's Theorem 6.5. Proving exclusivity required 
establishing uniqueness of algorithmic typechecking. 

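Lemma 40 (Uniqueness of algorithmic types).

(1) If $\Gamma \vdash_\Sigma M \Rightarrow A$ and $\Gamma \vdash_\Sigma M \Rightarrow A'$ then $A = A'$.

(2) If $\Gamma \vdash_\Sigma A \Rightarrow K$ and $\Gamma \vdash_\Sigma A \Rightarrow K'$ then $K = K'$.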

Equipped with Thm. 13 and the uniqueness lemma above, we can show a form 
of HP05's Theorem 6.2. Note that uses of Thm. 13 are safe because we always call 
the algorithmic equivalence judgments on terms that are well-formed, and hence 
(by Thm. 2) algorithmically equivalent to themselves. 
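Algorithmic equivalence $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau \Uparrow \bar{Q}$:

\[
\frac{M \xrightarrow{\mathrm{whr}} M' \qquad \Delta \vdash_\Sigma M' \Leftrightarrow N : a^- \Uparrow \bar{Q}}
     {\Delta \vdash_\Sigma M \Leftrightarrow N : a^- \Uparrow \bar{Q}}
\qquad
\frac{N \xrightarrow{\mathrm{whr}} N' \qquad \Delta \vdash_\Sigma M \Leftrightarrow N' : a^- \Uparrow \bar{Q}}
     {\Delta \vdash_\Sigma M \Leftrightarrow N : a^- \Uparrow \bar{Q}}
\]
\[
\frac{\Delta \vdash_\Sigma M \leftrightarrow N : a^- \downarrow Q}
     {\Delta \vdash_\Sigma M \Leftrightarrow N : a^- \Uparrow Q}
\qquad
\frac{(x, \tau)::\Delta \vdash_\Sigma M\,x \Leftrightarrow N\,x : \tau' \Uparrow \bar{Q} \qquad x \mathrel{\#} (\Delta, M, N)}
     {\Delta \vdash_\Sigma M \Leftrightarrow N : \tau \to \tau' \Uparrow \lambda x.\,\bar{Q}}
\]

Structural equivalence $\Delta \vdash_\Sigma M \leftrightarrow N : \tau \downarrow Q$:

\[
\frac{(x, \tau) \in \Delta \qquad \vdash \Sigma\ \mathsf{ssig} \qquad \vdash \Delta\ \mathsf{sctx}}
     {\Delta \vdash_\Sigma x \leftrightarrow x : \tau \downarrow x}
\qquad
\frac{(c, \tau) \in \Sigma \qquad \vdash \Sigma\ \mathsf{ssig} \qquad \vdash \Delta\ \mathsf{sctx}}
     {\Delta \vdash_\Sigma c \leftrightarrow c : \tau \downarrow c}
\]
\[
\frac{\Delta \vdash_\Sigma M_1 \leftrightarrow N_1 : \tau_2 \to \tau_1 \downarrow Q_1 \qquad \Delta \vdash_\Sigma M_2 \Leftrightarrow N_2 : \tau_2 \Uparrow \bar{Q}_2}
     {\Delta \vdash_\Sigma M_1\,M_2 \leftrightarrow N_1\,N_2 : \tau_1 \downarrow Q_1\,\bar{Q}_2}
\]

Fig. 8. Algorithmic equivalence rules instrumented to produce quasicanonical forms.

Theorem 14 (Quasidecidability of algorithmic typechecking).

(1) For any $\Sigma$, $\vdash \Sigma \Rightarrow \mathsf{sig}$ is quasidecidable.

(2) For any $\Sigma, \Gamma$, if $\vdash \Sigma \Rightarrow \mathsf{sig}$ holds then $\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx}$ is quasidecidable.

(3) For any $\Sigma, \Gamma, M$, if $\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx}$ holds then $\exists A.\ \Gamma \vdash_\Sigma M \Rightarrow A$ is quasidecidable.

(4) For any $\Sigma, \Gamma, A$, if $\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx}$ holds then $\exists K.\ \Gamma \vdash_\Sigma A \Rightarrow K$ is quasidecidable.

(5) For any $\Sigma, \Gamma, K$, if $\vdash_\Sigma \Gamma \Rightarrow \mathsf{ctx}$ holds then $\Gamma \vdash_\Sigma K \Rightarrow \mathsf{kind}$ is quasidecidable.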

3.8 Quasicanonical forms 

Section 7 of HP05 discusses quasicanonical forms which can be used to study the
adequacy, or correctness, of LF encodings. Quasicanonical forms are untyped λ-terms
that correspond to the β-normal, η-long forms of well-typed LF terms. Quasicanonical
forms $\bar{Q}$ and quasiatomic forms $Q$ are given by the grammar rules:

\[
\bar{Q} ::= Q \mid \lambda x.\,\bar{Q} \qquad\qquad Q ::= x \mid c \mid Q\,\bar{Q}
\]

HP05 introduces instrumented algorithmic equivalence judgments that construct
quasicanonical forms for algorithmically and structurally equivalent terms,
respectively. The rules are shown in Fig. 8.

It is straightforward to show that quasicanonical and quasiatomic forms exist
and are unique (provided that $\Sigma$ and $\Delta$ are valid).

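Lemma 41 (Properties of quasicanonical forms).

(1) If $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau$ then $\exists \bar{Q}.\ \Delta \vdash_\Sigma M \Leftrightarrow N : \tau \Uparrow \bar{Q}$.

(2) If $\Delta \vdash_\Sigma M \leftrightarrow N : \tau$ then $\exists Q.\ \Delta \vdash_\Sigma M \leftrightarrow N : \tau \downarrow Q$.

(3) If $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau \Uparrow \bar{Q}$ then $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau$.

(4) If $\Delta \vdash_\Sigma M \leftrightarrow N : \tau \downarrow Q$ then $\Delta \vdash_\Sigma M \leftrightarrow N : \tau$.

(5) If $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau \Uparrow \bar{Q}$ and $M \xrightarrow{\mathrm{whr}} M'$ then $\Delta \vdash_\Sigma M' \Leftrightarrow N : \tau \Uparrow \bar{Q}$.

(6) If $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau \Uparrow \bar{Q}$ and $N \xrightarrow{\mathrm{whr}} N'$ then $\Delta \vdash_\Sigma M \Leftrightarrow N' : \tau \Uparrow \bar{Q}$.

Theorem 15 (Uniqueness of quasicanonical forms).

(1) If $\vdash \Delta\ \mathsf{sctx}$ and $\vdash \Sigma\ \mathsf{ssig}$ and $\Delta \vdash_\Sigma M \Leftrightarrow N : \tau \Uparrow \bar{Q}_1$ and
$\Delta \vdash_\Sigma M \Leftrightarrow N : \tau \Uparrow \bar{Q}_2$ then $\bar{Q}_1 = \bar{Q}_2$.

(2) If $\vdash \Delta\ \mathsf{sctx}$ and $\vdash \Sigma\ \mathsf{ssig}$ and $\Delta \vdash_\Sigma M \leftrightarrow N : \tau \downarrow Q_1$ and
$\Delta \vdash_\Sigma M \leftrightarrow N : \tau' \downarrow Q_2$ then $\tau = \tau'$ and $Q_1 = Q_2$.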

Proof. By induction on derivations, using Lem. 41(5,6) in the cases involving 
weak head reduction. □ 

The main result about these forms in HP05 is that well-formed LF terms can be
recovered from quasicanonical forms and type information. To show this, we write
$N \Uparrow \bar{Q}$ or $N \downarrow Q$ for the relations that relate objects $N$ with their quasicanonical
forms $\bar{Q}$ or quasiatomic forms $Q$, respectively, where the type-labels have been
erased. (HP05 defined this notion as a partial function, which would be difficult to
define with the Nominal Datatype Package at the time of writing.) These relations
are defined as follows:

\[
\frac{}{x \downarrow x}
\qquad
\frac{}{c \downarrow c}
\qquad
\frac{M \downarrow Q \qquad N \Uparrow \bar{Q}}{(M\,N) \downarrow Q\,\bar{Q}}
\qquad
\frac{M \Uparrow \bar{Q}}{(\lambda x{:}A.\,M) \Uparrow \lambda x.\,\bar{Q}}
\qquad
\frac{M \downarrow Q}{M \Uparrow Q}
\]
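
Read as a partial function, in the style of HP05's original presentation, the erasure of type labels can be sketched in Python as two mutually recursive functions that return `None` where the relation has no derivation. The tuple encoding and the names `atomic` and `canonical` are assumptions of this illustration and do not come from the formalization, which uses inductively defined relations instead.

```python
# LF objects: ("var", x) | ("const", c) | ("app", M, N) | ("lam", x, A, M)
# Quasi forms use the same tags, but ("lam", x, Q) carries no type label.

def atomic(m):
    """Return Q with m 'down' Q (quasiatomic form), or None if undefined."""
    if m[0] in ("var", "const"):
        return m
    if m[0] == "app":
        q, qbar = atomic(m[1]), canonical(m[2])
        if q is not None and qbar is not None:
            return ("app", q, qbar)
    return None                                # abstractions are never atomic

def canonical(m):
    """Return Q-bar with m 'up' Q-bar (quasicanonical form), or None if undefined."""
    if m[0] == "lam":                          # drop the type label A
        body = canonical(m[3])
        return None if body is None else ("lam", m[1], body)
    return atomic(m)                           # atomic forms are canonical
```

For example, `canonical` maps `\x:a. c x` to `\x. c x`, while a β-redex such as `(\x:a. c x) y` has no quasicanonical form because its head is an abstraction, so the function returns `None`.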

In the proof of the Quasicanonical Forms theorem (Theorem 7.1 of HP05) we
found it necessary to prove several nontrivial auxiliary lemmas such as the admissibility of $\eta$-equivalence (which was not discussed in HP05):

Lemma 42 (Eta-equivalence). If $x \mathrel{\#} \Gamma$ and $\Gamma \vdash_{\Sigma} M : \Pi x{:}A_1.\,A_2$ then
$\Gamma \vdash_{\Sigma} M = \lambda x{:}A_1.\,M\,x : \Pi x{:}A_1.\,A_2$.

The following theorem is stated slightly differently than the corresponding theo- 
rem in HP05 (Theorem 7.1), but their version follows immediately from this version. 

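Theorem 16 (Quasicanonical forms).

(1) If $\Gamma^{-} \vdash_{\Sigma^{-}} M_1 \Leftrightarrow M_2 : A^{-} \Uparrow \overline{O}$ and $\Gamma \vdash_{\Sigma} M_1 : A$ and $\Gamma \vdash_{\Sigma} M_2 : A$ then
$\exists N.\ N \Uparrow \overline{O}$ and $\Gamma \vdash_{\Sigma} N : A$ and $\Gamma \vdash_{\Sigma} M_1 = N : A$ and $\Gamma \vdash_{\Sigma} M_2 = N : A$.

(2) If $\Gamma^{-} \vdash_{\Sigma^{-}} M_1 \leftrightarrow M_2 : \tau \downarrow \underline{O}$ and $\Gamma \vdash_{\Sigma} M_1 : A_1$ and $\Gamma \vdash_{\Sigma} M_2 : A_2$
then $\Gamma \vdash_{\Sigma} A_1 = A_2 : \mathit{type}$ and $A_1^{-} = \tau$ and $A_2^{-} = \tau$ and ($\exists N.\ N \downarrow \underline{O}$ and
$\Gamma \vdash_{\Sigma} N : A_1$ and $\Gamma \vdash_{\Sigma} M_1 = N : A_1$ and $\Gamma \vdash_{\Sigma} M_2 = N : A_2$).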

3.9 Adequacy 

Conventionally, adequacy is the property that the terms of the object language 
are in a bijective correspondence with the well-formed LF terms of a given type, 
modulo LF equality. Moreover, the bijection should be compositional in the sense 
that substitution for the object language is preserved and reflected by substitution 
in LF. The exact statement of the adequacy theorem for a given language depends 
on the language and its definition of substitution. To illustrate how quasicanonical 
forms could be used for reasoning about adequacy, HP05 introduces a small example 
language of first-order terms $t$ and formulas $\varphi$, similar to the following:

$$t, u ::= x \mid f(t, u) \qquad\qquad \varphi, \psi ::= t = u \mid \varphi \wedge \psi \mid \forall x.\varphi$$

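along with an appropriate LF signature $\Sigma_{FO}$ with types $\iota$ for first-order terms, $o$
for first-order formulas, and constants

$$c_f : \iota \to \iota \to \iota \qquad c_{=} : \iota \to \iota \to o \qquad c_{\wedge} : o \to o \to o \qquad c_{\forall} : (\iota \to o) \to o.$$

Footnote: This term is used in HP05 without being defined, but this is the definition used in other articles
which discuss adequacy, for example [Harper et al. 1993; Pfenning 2001].

$$\frac{\text{[premise illegible in the source; presumably } (x, \iota) \in \Gamma \text{]}}{\Gamma \vdash x \leftrightsquigarrow x : \iota} \qquad \frac{\Gamma \vdash t_1 \leftrightsquigarrow M_1 : \iota \quad \Gamma \vdash t_2 \leftrightsquigarrow M_2 : \iota}{\Gamma \vdash f(t_1, t_2) \leftrightsquigarrow c_f\ M_1\ M_2 : \iota}$$

$$\frac{\Gamma \vdash t_1 \leftrightsquigarrow M_1 : \iota \quad \Gamma \vdash t_2 \leftrightsquigarrow M_2 : \iota}{\Gamma \vdash t_1 = t_2 \leftrightsquigarrow c_{=}\ M_1\ M_2 : o} \qquad \frac{\Gamma \vdash \varphi_1 \leftrightsquigarrow M_1 : o \quad \Gamma \vdash \varphi_2 \leftrightsquigarrow M_2 : o}{\Gamma \vdash \varphi_1 \wedge \varphi_2 \leftrightsquigarrow c_{\wedge}\ M_1\ M_2 : o}$$

$$\frac{(x, \iota)::\Gamma \vdash \varphi \leftrightsquigarrow M : o \quad x \mathrel{\#} \Gamma}{\Gamma \vdash \forall x.\varphi \leftrightsquigarrow c_{\forall}\ (\lambda x.M) : o}$$

Fig. 9. Adequacy translation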

HP05 then defines translation judgments $\Gamma \vdash t \leftrightsquigarrow M : \iota$ and $\Gamma \vdash \varphi \leftrightsquigarrow M : o$
relating LF terms $M$ with first-order terms and formulas $t : \iota$ and $\varphi : o$. Note
that unlike most other judgments in this article, the translations are not implicitly
parametrized by a signature $\Sigma$ since they only refer to constants from the fixed
signature $\Sigma_{FO}$. The rules for the translation are shown in Fig. 9.

Harper and Pfenning then formulate the adequacy property for this language in 
their Theorem 7.2 as follows: 

Theorem 17 (Adequacy for syntax of first-order logic). Let $\Gamma$ be a
context of the form $x_1 : \iota, \ldots, x_n : \iota$ for some $n > 0$.

(1) The relation $\Gamma \vdash t \leftrightsquigarrow M : \iota$ is a compositional bijection between terms $t$ of
first-order logic over variables $x_1, \ldots, x_n$ and quasi-canonical forms $M$ of type
$\iota$ relative to $\Gamma$.

(2) The relation $\Gamma \vdash \varphi \leftrightsquigarrow M : o$ is a compositional bijection between formulas
$\varphi$ of first-order logic over variables $x_1, \ldots, x_n$ and quasi-canonical forms $M$ of
type $o$ relative to $\Gamma$.

Their proof sketch involves first showing that (for all appropriate F) the trans- 
lations are bijections, and then proving compositionality by induction over the 
structure of terms and formulas. 

Unfortunately, the statement of this theorem is ambiguous or at least incom- 
plete. The reason is that Harper and Pfenning do not explicitly define what it 
means for a bijection to be compositional. Even assuming the standard definition 
of compositionality as substitution preservation, HP05 did not provide a definition 
of substitution for quasicanonical forms. 

If we wish to substitute a quasicanonical form for a variable $y$ in another quasicanonical form, the result is not always quasicanonical. For example, if we substitute $\lambda x.M$ for $y$ in $y\ N$, we get $(\lambda x.M)\ N$, which is not quasicanonical. This illustrates that quasicanonical forms are not closed under substitution of quasicanonical
forms for variables, because variables are quasiatomic forms and substituting a $\lambda$-expression for a variable may introduce $\beta$-redexes.
It has been observed elsewhere (apparently first by Watkins et al. [2003]) that
substitution can be defined for well-formed quasicanonical expressions in a hereditary way that recursively renormalizes any $\beta$-redexes introduced by substitution.
Harper and Licata [2007] have shown how this idea can be used as the basis for 
a variant of LF called Canonical LF in which all expressions are maintained in 
canonical form. 
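To make the contrast concrete, the following Python sketch compares naive substitution with a simplified, untyped form of hereditary substitution on the example $y\ N$ discussed above. It is an illustration under stated assumptions rather than the definition used by Watkins et al. or by Harper and Licata: all class and function names are hypothetical, bound variable names are assumed distinct from all free variable names (so capture cannot occur), and termination is only to be expected for inputs that are well-formed in the typed setting, since the published definitions use type information to justify the recursion.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class App:
    fun: "Term"
    arg: "Term"


@dataclass(frozen=True)
class Lam:
    var: str
    body: "Term"


Term = Union[Var, Const, App, Lam]


def naive_subst(t: Term, y: str, s: Term) -> Term:
    """Plain substitution of s for y; may leave beta-redexes behind."""
    if isinstance(t, Var):
        return s if t.name == y else t
    if isinstance(t, Const):
        return t
    if isinstance(t, Lam):
        return t if t.var == y else Lam(t.var, naive_subst(t.body, y, s))
    return App(naive_subst(t.fun, y, s), naive_subst(t.arg, y, s))


def hsubst(t: Term, y: str, s: Term) -> Term:
    """Hereditary substitution: substitute, then contract any redex created."""
    if isinstance(t, Var):
        return s if t.name == y else t
    if isinstance(t, Const):
        return t
    if isinstance(t, Lam):
        return t if t.var == y else Lam(t.var, hsubst(t.body, y, s))
    fun = hsubst(t.fun, y, s)
    arg = hsubst(t.arg, y, s)
    if isinstance(fun, Lam):
        # The head became a lambda: substitute the argument hereditarily.
        return hsubst(fun.body, fun.var, arg)
    return App(fun, arg)


t = App(Var("y"), Const("c"))            # y c
s = Lam("x", App(Const("f"), Var("x")))  # \x. f x
redex = naive_subst(t, "y", s)           # (\x. f x) c, not quasicanonical
normal = hsubst(t, "y", s)               # f c, quasicanonical
```

Here `redex` is the application of a $\lambda$-abstraction to `c`, while `normal` is `App(Const("f"), Const("c"))`, which again fits the grammar of quasicanonical forms.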

In our initial formalization (reported in [Urban et al. 2008]) we misinterpreted 
the definition of the translation slightly by defining the adequacy translations to 
relate first-order terms and formulas to quasiatomic forms. It is easy to define 
substitution of quasiatomic forms for variables since no reduction can be introduced 
in doing so. Consequently, we proved a variant of HP05's Theorem 7.2 with the word
"quasicanonical" replaced by "quasiatomic". However, even with this modification,
the formal proof is not as easy as the sketch in HP05 suggests; for example, we 
needed to prove weakening, exchange, and substitution lemmas for the translation 
judgment in order to establish compositionality. 

After we discovered and corrected the mismatch between our definition and the 
original translation, we were still able to prove that the translations are bijections. 
To establish compositionality, we also formalized hereditary substitution (using a 
simple form of Harper and Licata's definition) and showed that the translation 
maps object-language substitution to hereditary substitution.

Formalizing HP05's Theorem 7.2 thus appears to require either changing their
translation or introducing hereditary substitution, a nontrivial concept that was not 
mentioned in HP05. The Canonical LF approach now appears to be the preferred 
starting point for research on extensions to LF. Developing a full and satisfying 
formalization of hereditary substitutions and adequacy properties (and relating 
HP05's version of LF to Harper and Licata's development of Canonical LF [2007]) 
would be a significant independent undertaking. Therefore, we prefer to leave 
further study of adequacy based on hereditary substitution for future work. 

4. CODE GENERATION 

Since type checking in LF can be part of the trusted code base of proof-carrying 
code, Appel et al. [2003] were very careful to implement it as cleanly as possible 
and in as few lines of code as possible. Their motivation was that a small and clean 
implementation can be manually inspected and hence can be made robust against, 
for example, Thompson-style attacks [Thompson 1984]. For this they explicitly set 
out to minimize the number of library functions they have to trust in order for
their implementation to be correct. However, they relied upon the correctness of 
the type-checking algorithm in HP05. 

In this paper we have formally proved that both the equivalence checking and 
type-checking algorithms from HP05 are sound and complete. Consequently, we 
can remove this aspect from our "trusted code base". In this section we show how
to obtain a verified executable ML-implementation of the type-checking algorithm 
from our proof of correctness. 

Isabelle/HOL contains a code generator implemented by Berghofer and Nipkow 
[2002] which can translate inductive definitions into executable pure ML-code au- 
tomatically. To be able to use this code generator, however, we need to invest 
some further work. The present version of this code generator can only deal with
rules involving datatypes, not nominal datatypes. To surmount this problem we 
translate our nominal representation of kinds, types and terms into a locally name- 
less representation [McKinna and Pollack 1999; Aydemir et al. 2008], which can be 
implemented as an ordinary Isabelle/HOL datatype. For the LF-syntax this gives 
rise to the definition: 

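$$\begin{array}{lcl}
\text{Locally Nameless Kinds} & ::= & \mathit{type} \mid \Pi A.\,K \\
\text{Locally Nameless Types} & ::= & a \mid \Pi A_1.\,A_2 \mid A\ M \\
\text{Locally Nameless Objects} & ::= & c \mid x \mid n \mid \lambda A.\,M \mid M_1\ M_2
\end{array}$$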

where terms contain de Bruijn indices $n$ for bound variables [de Bruijn 1972]. In
comparison with "pure" de Bruijn representations, in the locally nameless representation free variables still have names. This means we can continue using our
implementation of signatures and contexts in judgments. With a "pure" de Bruijn
representation, contexts would need to be referenced by numbers and positions.

While the locally nameless representation is straightforward to implement in
Isabelle/HOL, the translations between the nominal and locally nameless representation involve quite a lot of formalisation work. First we have to define a well-formedness predicate that ensures that there are no loose de Bruijn indices. We
also need three substitution operations, namely substituting (well-formed) terms
for free variables, written $(-)[x := M]$, substituting terms for de Bruijn indices,
written $(-)[n := M]$, and substituting de Bruijn indices for variables, written $(-)[x := n]$. In the latter we have to increase the de Bruijn index whenever the substitution moves under a binder. Also the translation functions between the nominal
and locally nameless representations are non-trivial to define. In one direction the
translation is a partial function and only total over well-formed locally nameless
terms. In the other direction we use a translation depending on an explicit list of
variables. The idea is to push a variable onto the list whenever the translation goes
under a $\lambda$- or a $\Pi$-abstraction. Now the de Bruijn index for a variable occurrence
is the position of the variable in this list. The translation, written $|-|_{xs}$, can be
formally defined as



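$$\begin{array}{lcll}
|\mathit{type}|_{xs} & = & \mathit{type} & \\
|\Pi x{:}A.\,K|_{xs} & = & \Pi |A|_{xs}.\ |K|_{(x::xs)} & \text{provided } x \mathrel{\#} xs \\[4pt]
|a|_{xs} & = & a & \\
|A\ M|_{xs} & = & |A|_{xs}\ |M|_{xs} & \\
|\Pi x{:}A_1.\,A_2|_{xs} & = & \Pi |A_1|_{xs}.\ |A_2|_{(x::xs)} & \text{provided } x \mathrel{\#} xs \\[4pt]
|c|_{xs} & = & c & \\
|x|_{xs} & = & \mathit{index}\ x\ xs\ 0 & \\
|M\ N|_{xs} & = & |M|_{xs}\ |N|_{xs} & \\
|\lambda x{:}A.\,M|_{xs} & = & \lambda |A|_{xs}.\ |M|_{(x::xs)} & \text{provided } x \mathrel{\#} xs
\end{array}$$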



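where the variable case is defined in terms of the auxiliary function $\mathit{index}\ x\ xs\ n$:

$$\begin{array}{lcl}
\mathit{index}\ x\ []\ n & = & x \\
\mathit{index}\ x\ (y::ys)\ n & = & (\textit{if } x = y \textit{ then } n \textit{ else } \mathit{index}\ x\ ys\ (\mathit{Suc}\ n))
\end{array}$$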
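The translation can be sketched in Python for the object level only. This is a simplified illustration, not the Isabelle/HOL definition: terms are encoded as tagged tuples, the tags (`"var"`, `"fvar"`, `"bvar"`, and so on) and the function names `index` and `to_ln` are hypothetical, and type annotations on $\lambda$-abstractions are omitted. The freshness side condition has no counterpart here; with concrete names, a shadowed variable simply resolves to its innermost binder because `index` returns the first match.

```python
def index(x, xs, n=0):
    """Return a bound-variable node with the position of x in xs (offset by n),
    or a free-variable node if x does not occur in xs."""
    if not xs:
        return ("fvar", x)
    y, ys = xs[0], xs[1:]
    return ("bvar", n) if x == y else index(x, ys, n + 1)


def to_ln(t, xs=()):
    """Translate a named term into a locally nameless term.

    Named terms:   ("var", x) | ("const", c) | ("app", M, N) | ("lam", x, M)
    Locally nameless terms:
                   ("fvar", x) | ("bvar", n) | ("const", c) | ("app", M, N) | ("lam", M)
    """
    tag = t[0]
    if tag == "var":
        return index(t[1], list(xs))
    if tag == "const":
        return t
    if tag == "app":
        return ("app", to_ln(t[1], xs), to_ln(t[2], xs))
    if tag == "lam":
        # Push the bound name onto the list when going under the binder.
        return ("lam", to_ln(t[2], (t[1],) + tuple(xs)))
    raise ValueError(f"unknown term tag: {tag!r}")


# \x. \y. (x y) z  with z free
named = ("lam", "x", ("lam", "y",
         ("app", ("app", ("var", "x"), ("var", "y")), ("var", "z"))))
nameless = to_ln(named)
```

In `nameless`, the occurrence of `x` becomes `("bvar", 1)`, `y` becomes `("bvar", 0)`, and the free variable `z` keeps its name as `("fvar", "z")`.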

The problem with this definition arises from the fact that inductions need to be 
appropriately generalised in order to take the potentially growing list of variables 
into account. This is sometimes easy to do, but sometimes we needed a lot of
ingenuity to find the right lemmas to get inductions through. 

Having translated all our terms into the locally nameless representation, we solved 
the technical problem with the code generator in Isabelle/HOL. However, there is 
a further problem that needs to be solved: the algorithms specified so far are not yet
concrete enough to be translated directly into runnable ML-code. For this consider 
again the algorithmic equivalence rule 

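$$\frac{(x, \tau_1)::\Delta \vdash_{\Sigma} M\ x \Leftrightarrow N\ x : \tau_2 \qquad x \mathrel{\#} (\Delta, M, N)}{\Delta \vdash_{\Sigma} M \Leftrightarrow N : \tau_1 \to \tau_2}$$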

from Fig. 4. This rule decides the equivalence between the terms $M$ and $N$ having
function type. When read bottom-up, it states that we need to introduce a variable
$x$ (any will do) that is fresh for $\Delta$, $M$ and $N$. ML does not have any built-in facilities
for choosing such a fresh name (unlike, for example, FreshML by Shinwell et al.
[2003]). This means for an ML-implementation of type and equivalence checking
that we need to make explicit which fresh name should be chosen. An obvious choice
is to inspect all free variables occurring in $\Delta$, $M$ and $N$, and produce a variable
with a higher index. In our case, it suffices to compute the maximum index of all
variables in scope and increase by one to obtain a fresh variable index. We are
able to compute this index because names in the Nominal Datatype Package have
a natural number as index and thus can be ordered. This allows us to formulate
algorithmic equivalence rules as follows

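$$\frac{(x, \tau_1)::\Delta\ {}_{ln}\vdash_{\Sigma} M\ x \Leftrightarrow N\ x : \tau_2 \qquad x = \mathit{maxi}\ (\mathit{fv}\ \Delta \mathbin{@} \mathit{fv}\ M \mathbin{@} \mathit{fv}\ N)}{\Delta\ {}_{ln}\vdash_{\Sigma} M \Leftrightarrow N : \tau_1 \to \tau_2}$$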

where fv is a polymorphic function producing a list of free variables of a term or 
context, and the function maxi scans through a list of variables and returns the 
highest variable increased by one. 
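A minimal Python sketch of this fresh-name choice is given below. It assumes, as the text does, that variable names can be identified with natural-number indices; the tuple encoding of terms, the helper `fv_ctx`, and the convention that `maxi` returns 0 on an empty list are assumptions made for this illustration and are not taken from the formalization.

```python
def fv(t):
    """Free variables (as a list of indices) of a locally nameless term.

    Terms: ("fvar", i) | ("bvar", n) | ("const", c) | ("app", M, N) | ("lam", M)
    """
    tag = t[0]
    if tag == "fvar":
        return [t[1]]
    if tag == "app":
        return fv(t[1]) + fv(t[2])
    if tag == "lam":
        return fv(t[1])
    return []  # bound variables and constants contribute nothing


def fv_ctx(ctx):
    """Variables declared in a simple-type context, given as (index, type) pairs."""
    return [x for x, _ in ctx]


def maxi(vs):
    """Highest variable index in vs increased by one (0 for an empty list)."""
    return max(vs) + 1 if vs else 0


ctx = [(3, "tau")]
m = ("fvar", 5)
n = ("app", ("fvar", 1), ("fvar", 3))
x = maxi(fv_ctx(ctx) + fv(m) + fv(n))
```

Because `x` is strictly larger than every index in the combined list, it cannot occur in the context or in either term, which is exactly the freshness condition the original rule asks for.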

In Fig. 10 we show the rules for type checking in the locally nameless represen- 
tation and with the explicit choice of fresh variables. The locally nameless variants 
of these judgments are marked by the subscript $ln$. We omit the locally nameless
versions of the algorithmic equivalence rules but they are similar. The functions
$\mathit{fi}\ (-)$ and $\mathit{fv}\ (-)$ calculate the free identifiers and free variables of their arguments,
respectively. 

It is important to note that it would be extremely inconvenient to build the 
concrete choice for a fresh variable into the rules that are used in the soundness
and completeness proofs described in the earlier sections. The reason is that several 
of the proofs would not go through as stated in HP05 since the choice is not fresh 
enough for all entities considered in some lemmas (an example is the weakening 
property, where the variable x is assumed to be not just fresh for A, M and N, 
but also for a larger context A'). It is however relatively straightforward to show 
the equivalence (i.e., they derive the same judgments, modulo translation) between 
the original rules and the rules with the concrete choice for fresh variables. We can 
show: 

Lemma 43 (Equivalence). 
{1) \- S =^ sig if and only if => sig. 
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S sig 

(nh S ^ sig [] in\-u A ^ iype c ^ fi E 
In^ [] ^ sis !nl~ (c, j4)::i: =^ sig 

j„l- E =^ sip [] ;„h^ K fcind a ^ fi S 
i„l- (a, if);;!: ^ sig 

inl" ^ ^ ««3 inl-i; r ^ ctx r A ^ type x ^ fv T 

in^s [] ctx i„\-s {x, A)::r => ctx 

r M ^ A 

in'<-E r ^ ctx (x, A) G r r ^ ctx (c, ^) e i: 

ln'<-S X ^ A r in'rs C ^ A 

r in^s Ml => n^z'- Al r i^^hs Mz ^ A2 r- ;„h^- A2 <^ ^12' : %pe~ 
r i„\-s Ml M2 ^ Ai[0 := M2] 
r in'^E Al type 

(x, ^i)::r in'rs M2IO := x] ^ A2 X = maxi (Jv F @fv M2 @fv Ai) A2 = A2IX := 0] 

r xAi. M2 ^ n^i. A2' 

r i„hs A ^IF 

In^s r ^ ctx {a, K) & E 
r a ^ K 

r in'^s A ^UA2'. Ki r M ^ A2 r- A2 <^ A2' : type- 

r ir,\-s AM ^ Ki[0 := M] 
r i„\-s Al ^ type {x, Ai)::r i^hs A2[0 := x] =^ type x = maxi (fv F @fv Ai @fv A2) 

r in'TE n^i. A2 => type 

r i„\-E K => kind 

In^S r ^ ctx 

r i^yE type kind 

r in'^s A =^ type {x, Ay.-.F K[0 := x] ^ kind x = maxi {fv F @fv A @fv K) 

r in^E n^. K =^ kind 

Fig. 10. Algorithmic typechecking rules used for generating executable code. 

{2) \-s r ^ ctx if and only if |r|[] ^ ctx. 

{3) r^s M ^ A if and only if ,„h|^|j, |M|[] ^ \A\^y 

(4) r^s A=^ K if and only if |r|[] ;„h|£ijj |^|[] ^ \K\^. 

(5) r K =^ kind if and only if \r\^ in^\E\ii |-^|[] =^ kind. 

From the rules in Fig. 10 the code generator of Isabelle/HOL can generate ML-code.
Of course the correctness of this code depends on the correctness of the generator.
However it is relatively easy to inspect the generated ML-code and we are confident
that it implements correctly the inductive definitions that have been proved to be
sound and complete with respect to their specification. We have used the extracted
ML-code to type-check several LF example signatures.

5. DISCUSSION 

It is difficult to argue objectively about the efficacy or usability of tools for mech- 
anized metatheory about languages with name-binding, since there are substantial 
differences among systems, there are few experts in the use of more than one system, and each such formalization is a major undertaking. Nevertheless, we believe
it is worthwhile to make some subjective observations about our experience formal- 
izing LF using Nominal Isabelle/HOL, and identify aspects of the two systems that 
aided or hindered formalization. 

Methodological observations. The formalization was performed by two of the au- 
thors; one is a developer of the Nominal Datatype Package and expert Isabelle/HOL 
user and the other had roughly three months' experience with these tools prior to 
starting the formalization. We estimate that the total effort involved in conducting 
the formalizations in Sec. 3 was at most three person-months. We worked on the 
code generation part intermittently and therefore do not have detailed information 
about the time required. Although there is still room for improvement in both 
Isabelle/HOL and the Nominal Datatype Package, our experience suggests that 
these tools can now be used to perform significant formalizations within reasonable 
time-frames, at least by experienced users. 

It took approximately six person-weeks to formalize everything up to the soundness proof (including pondering why the omitted case for type extensionality did
not go through). However, once Harper and Pfenning confirmed that this case
was indeed not handled correctly in their proof, one of the authors was able to 
check within 2 hours that adding a type-extensionality rule solves the problem. Re- 
checking the proof on paper would have meant reviewing approximately 31 pages 
of proofs. Subsequently we checked the validity of a solution suggested by Harper 
and found another solution for the problem. As a practical matter, the ability to 
rapidly evaluate the effects of changes to the system was essential for finding these 
solutions and evaluating other possibilities. In a similar formalization project, the 
first author showed that a central lemma in the informal proof in his PhD-thesis 
can be repaired [Urban and Zhu 2008]. 

Comparing the formalization and informal proof. In our formalization, we at- 
tempted to follow the syntax, definitions and proofs given in HP05 as closely as we 
could, and resisted the temptation to change their rules to make our task easier. 
We found that nominal techniques were usually able to state results almost exactly 
as they are presented on paper; the main differences tended to involve freshness 
or validity side-conditions that were left implicit in HP05. To illustrate this point, 
we have prepared this paper using Isabelle's documentation facilities [Nipkow et al. 
2002]. Most lemmas, theorems, and definitions in this paper have been generated 
directly from the formalization (the main exceptions are the quasidecidability and 
adequacy properties, which are paraphrased). 

In this article, we have focused on the high-level ideas of the formalization and, 
in the main, downplayed the low-level details of proofs using the Nominal Datatype 
Package. This is not because these details are embarrassing, but because (with a
few clearly-marked exceptions) they are prosaic. For example, as one would expect, 
our formalization also required formally proving many properties of substitution, 
swapping, freshness, contexts, erasure and so on. We have not discussed these 
because they are routine, and rely upon techniques already covered in previous 
work on nominal techniques [Urban 2008]. However, we would like to point out
here that the capability to define functions such as substitution and erasure as 
(nominal) primitive recursive functions, and use Isabelle/HOL's built-in simplifier 
to rewrite formulas involving these functions, was absolutely essential. This is not to 
say that these proofs were always easy, but that difficulties involving name-binding 
were usually not the dominant factor. 

We have not explicitly stated when features such as strong nominal induction 
or inversion principles [Urban et al. 2007; Berghofer and Urban 2008] have been 
used (nor have we given complete explanations of these techniques), but our for- 
malization relies upon them extensively. When these principles could be used, the 
paper proof was usually easy to translate to a formal proof step-by-step (although 
as is often the case with formalizations, an informal proof step often translated to 
many formal steps or necessitated additional lemmas). On the other hand, in a few 
cases nominal induction principles could not be applied, often because of subtleties 
involving binding. When this was the case, proof cases involving binding were often 
much more labor-intensive because they required explicit reasoning about choosing 
fresh names, alpha-equivalence, swapping and substitution (see, for example, the 
proof of transitivity of algorithmic equivalence). 

Many of our proofs have been written to match corresponding informal proofs
closely using the Isar proof-language, as in an example by Urban [2008, Sec. 6] 
of a typical substitution property. However, writing readable Isar proofs is labor- 
intensive; the proof-script tactic language of classic Isabelle/HOL tends to be much 
easier to write but harder to read. 

The interested, or skeptical, reader is welcome to consult the formalization for 
these details, replay the proofs of key properties, compare them with those in HP05, 
and form his or her own opinion. 

Metrics about the formalization. In Table I, we report some simple metrics about 
our formalization such as the sizes, number of lines of text, and number of lem- 
mas in each theory in the main formalization. As Table I shows, the core LF 
theory accounts for about 20% of the development. These syntactic properties are 
mostly straightforward, and their proofs merit only cursory discussion in HP05, 
but some lemmas have many cases which must each be handled individually. The 
Decidability theory accounts for another 15%; the quasidecidability proofs are 
verbose but largely straightforward. The LocallyN theory proves that the nom- 
inal datatypes version of LF is equivalent to a locally nameless formulation; this 
accounts for about 25% of the development. The effort involved in this part was 
therefore quite substantial: it can be explained by the lack of automatic infras- 
tructure for the locally nameless representation of binders in Isabelle/HOL, but 
also by the inherent subtleties when working with this representation. A number 
of lemmas need to be carefully stated, and in a few cases in rather non-intuitive 
ways. The remaining theories account for at most 5-10% of the formalization each;
the WeakAlgorithm theory defines the weak algorithmic equivalence judgment and
proves the additional properties needed for the third solution, and accounts for only
around 2% of the total development.

Table I. Summary of the formalization

The merit of metrics such as proof size or number of lemmas is debatable. We 
have not attempted to distinguish between meaningful lines of proof vs. blank or 
comment lines; nor have we distinguished between significant and trivial lemmas. 
Nevertheless, this information should at least convey an idea of the relative effort 
involved in each part of the proof. 

Correctness of the representation. The facilities for defining and reasoning about 
languages with binding provided by the Nominal Datatype Package are convenient, 
but their use may not be persuasive to readers unfamiliar with nominal logic and 
abstract syntax. Thus, a skeptical reader might ask whether these representations, 
definitions and reasoning principles are really correct; that is, whether they are 
equivalent to the definitions in HP05, as formalized using some more conventional 
approach to binding syntax. For higher-order abstract syntax representations, this 
property is often called adequacy; this term appears to have been coined in the con- 
text of LF [Harper et al. 1993], due to the potential problems involved in reasoning 
about higher-order terms modulo alpha, beta and eta-equivalence. 

Adequacy is also important for nominal techniques and deserves further study. 
We believe that the techniques explored in existing work on the semantics of nominal abstract syntax and its implementation in the Nominal Datatype Package [Gabbay and Pitts 2002; Pitts 2003; Cheney 2006; Pitts 2006; Urban 2008] suffice for
informally judging the correctness of our formalization. There has also been some 
prior work on formalizing adequacy results for nominal datatypes via isomorphisms. 
Urban [2008] proves a bijective correspondence between nominal datatypes and a 
conventional named implementation of the $\lambda$-calculus modulo $\alpha$-equivalence. Norrish and Vestergaard [2007] have formalized isomorphisms between nominal and
de Bruijn representations, and they provide further citations to several other isomorphism results. Our proof of equivalence to a locally nameless representation
described in Sec. 4 also gives evidence for the correctness of the nominal datatype 
representation. 

In any case, our formalization has exposed some subtle issues which make sense
in the context of LF, independently of whether or not nominal datatypes in Isabelle/HOL really capture our informal intuitions about abstract syntax with binding.

Reflecting on formalizing LF. It has been observed (as discussed, for example, by
Pientka [2007]) that the process of formalization can suggest changes that both ease
formalization and clarify the original system. Likewise, our formalization provides
a basis for reflecting on how the LF metatheory might be adapted to make it easier 
to formalize. Most obviously, many of the problems we encountered with soundness 
disappear if we simply add the omitted extensionality rule or change the equivalence 
algorithm. 

A more subtle complication we encountered was that since the algorithmic rules 
in HP05 do not enforce well-formedness, it is not even guaranteed that a variable 
appearing in one of the terms being compared also appears in the context A. This 
necessitates extra freshness conditions on many rules and induction hypotheses to 
ensure that strong nominal induction principles can be used safely. Building these 
constraints into the algorithmic rules might make several of the proofs about the 
equivalence algorithm cleaner. 

Another practical consideration was that the syntax and rules of LF in HP05 
exhibit redundancy, which leads to additional (albeit straightforward) formalization 
effort. For example, constants, dependent products, and applications each appear 
at more than one level of the syntax, resulting in proofs with redundant cases. 
Similarly, because objects, kinds and types are defined by mutual recursion, each 
inductive proof about syntax needs to have three inductive hypotheses and ten 
cases. Likewise, any proof concerning the definitional judgments needs to state 
eight simultaneous induction hypotheses and thirty-five cases. Collapsing the three
levels of LF syntax into one level, and collapsing the many definitional judgments 
into a smaller number could make the formalization much less verbose, as in Pure 
Type Systems [McKinna and Pollack 1999], at the cost of increasing the distance 
between the paper version and the formalization. On the other hand, such an 
approach could also make it easier to generalize proofs about LF to richer type 
theories. 

6. RELATED AND FUTURE WORK 

McKinna and Pollack's [1999] LEGO formalization of Pure Type Systems is probably the most extensive formalization of a dependent type theory in a theorem
prover. Their formalization introduced the locally nameless variant of de Bruijn's
name-free approach [de Bruijn 1972] and considered primarily syntactic properties
of pure type systems with $\beta$-equivalence, including a proof of strengthening. Pollack [1995] subsequently verified the partial correctness of typechecking algorithms
for certain classes of Pure Type Systems including LF.

Completely formalizing metatheoretic and syntactic proofs about languages and 
logics with name-binding has been a long-standing open problem in computational 
logic. We will not give a detailed survey of all of these techniques here, but men- 
tion a few recent developments. In the last five years, catalyzed by the POPLMark 
Challenge [Aydemir et al. 2005], there has been renewed interest in this area. Aydemir et al. [2008] have developed a methodology for formalizing metatheory in
Coq using the locally nameless representation to manage binding, and using cofinite
quantification to handle fresh names. Chlipala's parametric higher-order abstract
syntax is another recently developed technique for reasoning about abstract syntax
in Coq, and has been applied to good effect in reasoning about compiler transformations [Chlipala 2008]. Westbrook et al. [2009] are developing CNIC, a variant
of Coq that provides built-in support for nominal abstract syntax (generalizing
a simple nominal type theory developed by Cheney [2009]). Gacek et al. [2008]
have developed Abella, a proof assistant for reasoning about higher-order abstract
syntax, inductive definitions, and generic quantification (similar to nominal logic's
fresh-name quantifier). Schürmann and Sarnat [2008] have recently discovered techniques for performing logical relations proofs in Twelf [Pfenning and Schürmann
1999]. Formalizing the results in this article using these or other emerging tools
would provide a useful comparison of these approaches, particularly concerning 
decidability proofs, which ought to be easier in constructive logics. 

Algorithms for equivalence and canonicalization for dependent type theories have 
been studied by several authors. Prior work on equivalence checking for LF has 
focused on first checking well-formedness with respect to simple types, then $\beta$- or
$\beta\eta$-normalizing; these approaches are discussed in detail by Harper and Pfenning
[2005]. Coquand's algorithm [1991] is similar to Harper and Pfenning's but operates on untyped terms. Goguen's approach [2005b] involves first type-directed
$\eta$-expansion and then $\beta$-normalization, and relies on standard properties such as
the Church-Rosser theorem, strong normalization of $\beta$-reduction and strengthening.
Goguen [2005a] extends this proof technique to show termination of Coquand's and
Harper and Pfenning's algorithms, and gives a terminating type-directed algorithm
for checking $\beta\eta$-equivalence in System F. It may be interesting to formalize these
algorithms and proofs and compare with Harper and Pfenning's proof. 

Our formalization provides a foundation for several possible future investigations. 
We are interested in extending our formalization to include verifying Twelf-style 
meta-reasoning about LF specifications, following Harper and Licata's detailed in- 
formal development of Canonical LF [2007]. Doing so could make it possible to 
extract Isabelle/HOL theorems from Twelf proofs, but as discussed earlier, formal- 
izing Canonical LF, hereditary substitutions, and the rest of Harper and Licata's 
work appears to be a substantial challenge. 

It would also be interesting to extend our formalization to accommodate ex- 
tensions to LF involving (ordered) linear logic, concurrency, proof-irrelevance, or 
singleton kinds, as discussed by Harper and Pfenning [2005, Sec. 8]. We hope that
anyone who proposes an extension to LF will be able to use our formalization as a 
starting point for verifying its metatheory. 



7. CONCLUSIONS 

LF is an extremely convenient tool for defining logics and other calculi involving 
binding syntax. It has many compelling applications and underlies the system 
Twelf, which has a proven record in formalizing many programming language cal- 
culi. Hence, it is of intrinsic interest to verify key properties of LF's metatheory, 
such as the correctness and decidability of the typechecking algorithms. We have
done so, using the Nominal Datatype Package for Isabelle/HOL. The infrastructure
provided by this package allowed us to follow the proof of Harper and Pfenning
closely. 

For our formalization we had the advantage of working from Harper and Pfenning's
carefully written informal proof, which withstood rigorous mechanical
formalization rather well. Still, we found in this informal proof one gap and numerous
minor complications. We have shown that they can be repaired. We have also
partially verified the decidability of the equivalence and typechecking algorithms,
although some work remains to formally prove decidability per se. Formalizing 
decidability proofs of any kind in Isabelle/HOL appears to be an open problem, so 
we leave this for future work. 

While verifying correctness of proofs is a central motivation for doing
formalizations, it is not the only one. A second important benefit is that they can be
used to experiment with changes to the system rapidly. By replaying a modified
formalization in a theorem prover, one can immediately focus on places where the
proof fails and attempt to repair them rather than re-checking the many cases that
are unchanged. This capability was essential in fixing the soundness proof, and
it illustrates one of the distinctive advantages of performing such a formalization. 
Had we attempted to repair the gap using only the paper proof, experimenting with 
different solutions would have required manually re-checking the roughly 31 pages 
of paper proofs for each change. 

Our formalization is not only an end in itself but also provides a foundation for further
study in several directions. Researchers developing extensions to LF may find 
our formalization useful as a starting point for verifying the metatheory of such 
extensions. We plan to further investigate hereditary substitutions and adequacy 
proofs in LF and Canonical LF. More ambitiously, we contemplate formalizing the 
meaning and correctness of metatheoretic reasoning about LF specifications (as 
provided by the Twelf system) inside Isabelle/HOL, and extracting Isabelle/HOL 
theorems from Twelf proofs. 

ELECTRONIC APPENDIX 

The electronic appendix for this article can be accessed in the ACM Digital Library
by visiting the following URL: http://www.acm.org/pubs/citations/journals/tocl/20YY-V-N/p1-URLend.

A. FULL STATEMENTS OF SYNTACTIC RESULTS

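Lemma 1 (Freshness).

(1) If $\vdash \Sigma\ \mathsf{sig}$ then $x \mathrel{\#} \Sigma$.

(2) If $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ then $x \mathrel{\#} \Sigma$.

(3) If $\Gamma \vdash_\Sigma M : A$ and $x \mathrel{\#} \Gamma$ then $x \mathrel{\#} M$ and $x \mathrel{\#} A$.

(4) If $\Gamma \vdash_\Sigma A : K$ and $x \mathrel{\#} \Gamma$ then $x \mathrel{\#} A$ and $x \mathrel{\#} K$.

(5) If $\Gamma \vdash_\Sigma K : \mathsf{kind}$ and $x \mathrel{\#} \Gamma$ then $x \mathrel{\#} K$.

(6) If $\Gamma \vdash_\Sigma M = N : A$ and $x \mathrel{\#} \Gamma$ then $x \mathrel{\#} M$ and $x \mathrel{\#} N$ and $x \mathrel{\#} A$.

(7) If $\Gamma \vdash_\Sigma A = B : K$ and $x \mathrel{\#} \Gamma$ then $x \mathrel{\#} A$ and $x \mathrel{\#} B$ and $x \mathrel{\#} K$.

(8) If $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$ and $x \mathrel{\#} \Gamma$ then $x \mathrel{\#} K$ and $x \mathrel{\#} L$.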

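Lemma 2 (Implicit Validity).

(1) If $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ then $\vdash \Sigma\ \mathsf{sig}$.

(2) If $\Gamma \vdash_\Sigma M : A$ then $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ and $\vdash \Sigma\ \mathsf{sig}$.

(3) If $\Gamma \vdash_\Sigma A : K$ then $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ and $\vdash \Sigma\ \mathsf{sig}$.

(4) If $\Gamma \vdash_\Sigma K : \mathsf{kind}$ then $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ and $\vdash \Sigma\ \mathsf{sig}$.

(5) If $\Gamma \vdash_\Sigma M = N : A$ then $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ and $\vdash \Sigma\ \mathsf{sig}$.

(6) If $\Gamma \vdash_\Sigma A = B : K$ then $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ and $\vdash \Sigma\ \mathsf{sig}$.

(7) If $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$ then $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ and $\vdash \Sigma\ \mathsf{sig}$.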

Lemma 3 (Implicit Validity). If $\Gamma \vdash_\Sigma M : A$ then $\vdash \Sigma\ \mathsf{sig}$ and $\vdash_\Sigma \Gamma\ \mathsf{ctx}$.

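Lemma 4 (Weakening). Suppose $\vdash_\Sigma \Gamma_2\ \mathsf{ctx}$ and $\Gamma_1 \subseteq \Gamma_2$.

(1) If $\Gamma_1 \vdash_\Sigma M : A$ then $\Gamma_2 \vdash_\Sigma M : A$.

(2) If $\Gamma_1 \vdash_\Sigma A : K$ then $\Gamma_2 \vdash_\Sigma A : K$.

(3) If $\Gamma_1 \vdash_\Sigma K : \mathsf{kind}$ then $\Gamma_2 \vdash_\Sigma K : \mathsf{kind}$.

(4) If $\Gamma_1 \vdash_\Sigma M = N : A$ then $\Gamma_2 \vdash_\Sigma M = N : A$.

(5) If $\Gamma_1 \vdash_\Sigma A = B : K$ then $\Gamma_2 \vdash_\Sigma A = B : K$.

(6) If $\Gamma_1 \vdash_\Sigma K = L : \mathsf{kind}$ then $\Gamma_2 \vdash_\Sigma K = L : \mathsf{kind}$.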

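Lemma 5 (Substitution). Suppose $\Gamma_2 \vdash_\Sigma P : C$ and let $\Gamma = \Gamma_1 \mathbin{@} [(y, C)] \mathbin{@} \Gamma_2$.

(1) If $\vdash_\Sigma \Gamma\ \mathsf{ctx}$ then $\vdash_\Sigma \Gamma_1[y := P] \mathbin{@} \Gamma_2\ \mathsf{ctx}$.

(2) If $\Gamma \vdash_\Sigma M : B$ then $\Gamma_1[y := P] \mathbin{@} \Gamma_2 \vdash_\Sigma M[y := P] : B[y := P]$.

(3) If $\Gamma \vdash_\Sigma B : K$ then $\Gamma_1[y := P] \mathbin{@} \Gamma_2 \vdash_\Sigma B[y := P] : K[y := P]$.

(4) If $\Gamma \vdash_\Sigma K : \mathsf{kind}$ then $\Gamma_1[y := P] \mathbin{@} \Gamma_2 \vdash_\Sigma K[y := P] : \mathsf{kind}$.

(5) If $\Gamma \vdash_\Sigma M = N : A$ then $\Gamma_1[y := P] \mathbin{@} \Gamma_2 \vdash_\Sigma M[y := P] = N[y := P] : A[y := P]$.

(6) If $\Gamma \vdash_\Sigma A = B : K$ then $\Gamma_1[y := P] \mathbin{@} \Gamma_2 \vdash_\Sigma A[y := P] = B[y := P] : K[y := P]$.

(7) If $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$ then $\Gamma_1[y := P] \mathbin{@} \Gamma_2 \vdash_\Sigma K[y := P] = L[y := P] : \mathsf{kind}$.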
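
As an illustration of how Lemma 5 is typically used, consider the special case in which $\Gamma_1$ is the empty context. Assuming the usual list identities $[\,] \mathbin{@} \Gamma_2 = \Gamma_2$ and $[(y, C)] \mathbin{@} \Gamma_2 = (y, C) :: \Gamma_2$, and that substitution leaves the empty context unchanged, part (2) specializes to the familiar single-variable substitution principle:

$$
\text{If } \Gamma_2 \vdash_\Sigma P : C \text{ and } (y, C) :: \Gamma_2 \vdash_\Sigma M : B \text{ then } \Gamma_2 \vdash_\Sigma M[y := P] : B[y := P].
$$

This instance is only a worked example of the general statement; it is not stated separately in the source.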

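Lemma 6 (Context Conversion). Assume that $\Gamma \vdash_\Sigma B : \mathsf{type}$ and $\Gamma \vdash_\Sigma A = B : \mathsf{type}$. Then:

(1) If $(x, A) :: \Gamma \vdash_\Sigma M : C$ then $(x, B) :: \Gamma \vdash_\Sigma M : C$.

(2) If $(x, A) :: \Gamma \vdash_\Sigma C : K$ then $(x, B) :: \Gamma \vdash_\Sigma C : K$.

(3) If $(x, A) :: \Gamma \vdash_\Sigma K : \mathsf{kind}$ then $(x, B) :: \Gamma \vdash_\Sigma K : \mathsf{kind}$.

(4) If $(x, A) :: \Gamma \vdash_\Sigma C = D : K$ then $(x, B) :: \Gamma \vdash_\Sigma C = D : K$.

(5) If $(x, A) :: \Gamma \vdash_\Sigma K = L : \mathsf{kind}$ then $(x, B) :: \Gamma \vdash_\Sigma K = L : \mathsf{kind}$.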

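Lemma 7 (Functionality for Typing). Assume that $\Gamma \vdash_\Sigma M : C$ and $\Gamma \vdash_\Sigma N : C$ and $\Gamma \vdash_\Sigma M = N : C$. Then:

(1) If $\Gamma' \mathbin{@} [(y, C)] \mathbin{@} \Gamma \vdash_\Sigma P : B$ then $\Gamma'[y := M] \mathbin{@} \Gamma \vdash_\Sigma P[y := M] = P[y := N] : B[y := M]$.

(2) If $\Gamma' \mathbin{@} [(y, C)] \mathbin{@} \Gamma \vdash_\Sigma B : K$ then $\Gamma'[y := M] \mathbin{@} \Gamma \vdash_\Sigma B[y := M] = B[y := N] : K[y := M]$.

(3) If $\Gamma' \mathbin{@} [(y, C)] \mathbin{@} \Gamma \vdash_\Sigma K : \mathsf{kind}$ then $\Gamma'[y := M] \mathbin{@} \Gamma \vdash_\Sigma K[y := M] = K[y := N] : \mathsf{kind}$.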

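Lemma 8 (Validity). Objects, types and kinds appearing in derivable judgments are valid, that is:

(1) If $\Gamma \vdash_\Sigma M : A$ then $\Gamma \vdash_\Sigma A : \mathsf{type}$.

(2) If $\Gamma \vdash_\Sigma A : K$ then $\Gamma \vdash_\Sigma K : \mathsf{kind}$.

(3) If $\Gamma \vdash_\Sigma M = N : B$ then $\Gamma \vdash_\Sigma M : B$ and $\Gamma \vdash_\Sigma N : B$ and $\Gamma \vdash_\Sigma B : \mathsf{type}$.

(4) If $\Gamma \vdash_\Sigma A = B : K$ then $\Gamma \vdash_\Sigma A : K$ and $\Gamma \vdash_\Sigma B : K$ and $\Gamma \vdash_\Sigma K : \mathsf{kind}$.

(5) If $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$ then $\Gamma \vdash_\Sigma K : \mathsf{kind}$ and $\Gamma \vdash_\Sigma L : \mathsf{kind}$.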

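Lemma 9 (Typing inversion). The validity rules are invertible, up to conversion of types and kinds.

(1) If $\Gamma \vdash_\Sigma x : A$ then $\exists B.\ (x, B) \in \Gamma$ and $\Gamma \vdash_\Sigma A = B : \mathsf{type}$.

(2) If $\Gamma \vdash_\Sigma c : A$ then $\exists B.\ (c, B) \in \Sigma$ and $\Gamma \vdash_\Sigma A = B : \mathsf{type}$.

(3) If $\Gamma \vdash_\Sigma M_1\, M_2 : A$ then $\exists x\, A_1\, A_2.\ \Gamma \vdash_\Sigma M_1 : \Pi x{:}A_2.\, A_1$ and $\Gamma \vdash_\Sigma M_2 : A_2$ and $\Gamma \vdash_\Sigma A = A_1[x := M_2] : \mathsf{type}$.

(4) If $\Gamma \vdash_\Sigma \lambda x{:}A.\, M : B$ and $x \mathrel{\#} \Gamma$ then $\exists A'.\ \Gamma \vdash_\Sigma B = \Pi x{:}A.\, A' : \mathsf{type}$ and $\Gamma \vdash_\Sigma A : \mathsf{type}$ and $(x, A) :: \Gamma \vdash_\Sigma M : A'$.

(5) If $\Gamma \vdash_\Sigma \Pi x{:}A_1.\, A_2 : K$ and $x \mathrel{\#} \Gamma$ then $\Gamma \vdash_\Sigma K = \mathsf{type} : \mathsf{kind}$ and $\Gamma \vdash_\Sigma A_1 : \mathsf{type}$ and $(x, A_1) :: \Gamma \vdash_\Sigma A_2 : \mathsf{type}$.

(6) If $\Gamma \vdash_\Sigma c : K$ then $\exists L.\ (c, L) \in \Sigma$ and $\Gamma \vdash_\Sigma K = L : \mathsf{kind}$.

(7) If $\Gamma \vdash_\Sigma A\, M : K$ then $\exists x\, A_1\, K_2.\ \Gamma \vdash_\Sigma A : \Pi x{:}A_1.\, K_2$ and $\Gamma \vdash_\Sigma M : A_1$ and $\Gamma \vdash_\Sigma K = K_2[x := M] : \mathsf{kind}$.

(8) If $\Gamma \vdash_\Sigma \Pi x{:}A_1.\, K_2 : \mathsf{kind}$ and $x \mathrel{\#} \Gamma$ then $\Gamma \vdash_\Sigma A_1 : \mathsf{type}$ and $(x, A_1) :: \Gamma \vdash_\Sigma K_2 : \mathsf{kind}$.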

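Lemma 10 (Equality inversion).

(1) If $\Gamma \vdash_\Sigma \mathsf{type} = L : \mathsf{kind}$ then $L = \mathsf{type}$.

(2) If $\Gamma \vdash_\Sigma L = \mathsf{type} : \mathsf{kind}$ then $L = \mathsf{type}$.

(3) If $\Gamma \vdash_\Sigma A = \Pi x{:}B_1.\, B_2 : \mathsf{type}$ and $x \mathrel{\#} \Gamma$ then $\exists A_1\, A_2.\ A = \Pi x{:}A_1.\, A_2$ and $\Gamma \vdash_\Sigma A_1 = B_1 : \mathsf{type}$ and $(x, A_1) :: \Gamma \vdash_\Sigma A_2 = B_2 : \mathsf{type}$.

(4) If $\Gamma \vdash_\Sigma \Pi x{:}B_1.\, B_2 = B : \mathsf{type}$ and $x \mathrel{\#} \Gamma$ then $\exists A_1\, A_2.\ B = \Pi x{:}A_1.\, A_2$ and $\Gamma \vdash_\Sigma A_1 = B_1 : \mathsf{type}$ and $(x, A_1) :: \Gamma \vdash_\Sigma A_2 = B_2 : \mathsf{type}$.

(5) If $\Gamma \vdash_\Sigma K = \Pi x{:}B_1.\, L_2 : \mathsf{kind}$ and $x \mathrel{\#} \Gamma$ then $\exists A_1\, K_2.\ K = \Pi x{:}A_1.\, K_2$ and $\Gamma \vdash_\Sigma A_1 = B_1 : \mathsf{type}$ and $(x, A_1) :: \Gamma \vdash_\Sigma K_2 = L_2 : \mathsf{kind}$.

(6) If $\Gamma \vdash_\Sigma \Pi x{:}B_1.\, L_2 = L : \mathsf{kind}$ and $x \mathrel{\#} \Gamma$ then $\exists A_1\, K_2.\ L = \Pi x{:}A_1.\, K_2$ and $\Gamma \vdash_\Sigma A_1 = B_1 : \mathsf{type}$ and $(x, A_1) :: \Gamma \vdash_\Sigma K_2 = L_2 : \mathsf{kind}$.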

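Lemma 11 (Product injectivity). Suppose $x \mathrel{\#} \Gamma$.

(1) If $\Gamma \vdash_\Sigma \Pi x{:}A_1.\, A_2 = \Pi x{:}B_1.\, B_2 : \mathsf{type}$ then $\Gamma \vdash_\Sigma A_1 = B_1 : \mathsf{type}$ and $(x, A_1) :: \Gamma \vdash_\Sigma A_2 = B_2 : \mathsf{type}$.

(2) If $\Gamma \vdash_\Sigma \Pi x{:}A.\, K = \Pi x{:}B.\, L : \mathsf{kind}$ then $\Gamma \vdash_\Sigma A = B : \mathsf{type}$ and $(x, A) :: \Gamma \vdash_\Sigma K = L : \mathsf{kind}$.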

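Lemma 12 (Strong versions of rules). The following rules are admissible:

(1)
$$\frac{\Gamma \vdash_\Sigma M_1 : \Pi x{:}A_2.\, A_1 \qquad \Gamma \vdash_\Sigma M_2 : A_2}{\Gamma \vdash_\Sigma M_1\, M_2 : A_1[x := M_2]}$$

(2)
$$\frac{\Gamma \vdash_\Sigma A : \Pi x{:}B.\, K \qquad \Gamma \vdash_\Sigma M : B}{\Gamma \vdash_\Sigma A\, M : K[x := M]}$$

(3)
$$\frac{(x, A_1) :: \Gamma \vdash_\Sigma M_2 = N_2 : A_2 \qquad \Gamma \vdash_\Sigma M_1 = N_1 : A_1 \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma (\lambda x{:}A_1.\, M_2)\, M_1 = N_2[x := N_1] : A_2[x := M_1]}$$

(4)
$$\frac{\Gamma \vdash_\Sigma A_1 = B_1 : \mathsf{type} \qquad (x, A_1) :: \Gamma \vdash_\Sigma A_2 = B_2 : \mathsf{type} \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma \Pi x{:}A_1.\, A_2 = \Pi x{:}B_1.\, B_2 : \mathsf{type}}$$

(5)
$$\frac{\Gamma \vdash_\Sigma A = B : \mathsf{type} \qquad (x, A) :: \Gamma \vdash_\Sigma K = L : \mathsf{kind} \qquad x \mathrel{\#} \Gamma}{\Gamma \vdash_\Sigma \Pi x{:}A.\, K = \Pi x{:}B.\, L : \mathsf{kind}}$$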